Investigation into the predictive performance of colorimetric sensor strips using RGB, CMYK, HSV, and CIELAB coupled with various data preprocessing methods: a case study on an analysis of water quality parameters

The potential use of colorimetric sensors has received significant attention due to their feasibility for use in various applications. After reacting with a sample, the image of the colorimetric sensor can be captured and converted into digital data using several different color models. The analytical data can then be processed with various chemometric methods. This research study investigated the predictive performance of calibration models established using color models commonly used in analytical chemistry, including RGB, CMYK, HSV, and CIELAB. A total of eight commercially available colorimetric sensors were used to determine the presence of manganese (Mn²⁺), copper (Cu²⁺), iron (Fe²⁺/Fe³⁺), nitrate (NO₃⁻), phosphate (PO₄³⁻), and sulfate (SO₄²⁻), as well as total hardness and pH values. Real water samples collected in Chiang Mai, Thailand were used as external validation tests. Based on the data obtained using the synthetic test samples, the color parameter that was most similar to the color developed by the chemical sensor could offer satisfactory results. However, this was not always the case, especially when strips composed of multiple colorimetric sensors (sensor arrays) were used. When tested with external validation, the predictive performance could be improved using appropriate data preprocessing and, in this research study, a normalization method was recommended to guarantee the accuracy of the calibration models.


Introduction
In recent years, colorimetric sensors have received substantial attention because they are relatively cost-effective and allow chemical detection to be done without the need for complicated analytical instruments.
The detection of colorimetric sensors is based on the assumption that an analyte sample interacts with a chemoresponsive chemical sensor, resulting in an intense color change. Colorimetric sensors are composed of either a single or several sensing chemicals that are coated on a supporting material. Plain paper and hydrophobic materials, such as polyethylene terephthalate film (LaGasse et al. 2014) and polyvinylidene fluoride (Kim et al. 2016), are among the common supporting materials on which the sensor can be positioned as a narrow strip, allowing for practical and easy use (Pla-Tolós et al. 2016). The color change can be detected simply with the unaided eye for the purposes of screening or semi-quantitative analysis. To improve the detection ability, the images of the colorimetric sensor strips can be captured using a digital camera or an ordinary scanner. After that, the sensing colors can be converted into digitally coded values of various color models, allowing the recorded images to be digitally analyzed by simple processing tools such as mobile phones, laptop computers, and/or miniature computing devices based on the Raspberry Pi platform (Pla-Tolós et al. 2016; Cantrell et al. 2010).
In analytical chemistry, RGB (red, green, blue), CMYK (cyan, magenta, yellow, black), HSV (hue, saturation, value), and CIELAB are among the most common color models used to display the color spectrum (Cantrell et al. 2010; Sharifzadeh et al. 2014; Peng et al. 2020; Ravindranath et al. 2018; Wang et al. 2020). RGB is regarded as an additive model wherein a color is a combination of the three primary colors of red, green, and blue. On the other hand, CMYK is a subtractive model, meaning that a color is generated by subtracting the primary colors from white light. For example, white minus red is cyan, and white minus green is magenta. HSV is another color-coding system that represents color using a conical geometric shape comprised of hue (H), saturation (S), and value (V). The H parameter indicates colors using angular dimensions. For instance, red, green, and blue are located at 0°, 120°, and 240°, respectively. The horizontal axis of the HSV model uses the saturation of the color to indicate the grayness or the intensity of white or black, where the S value ranges from 0 (white/black) to 1 (pure color). On the vertical axis, the V parameter, or lightness, describes the amount of darkness in the color. A lightness value of 0 represents black, and the lightness increases all the way to 1. In CIELAB, or CIE L*a*b*, a color is expressed using three parameters: a*, b*, and lightness (L*). The L* parameter indicates the lightness of the color in a similar way to the V parameter used in HSV. Additionally, the a* and b* parameters indicate the amount of green-red and blue-yellow color components, respectively (Korifi et al. 2013).
These color models depict color as a multidimensional quantity, resulting in data of a multivariate nature. For example, a pH test strip composed of four pH indicators generates a total of 12 variables, namely three digitally coded RGB parameters determined from each of four different chemical-sensing mechanisms (3 × 4). In this circumstance, the analytical decision should be made based on the use of all available data in a multivariate manner. To quantitatively analyze the samples, multivariate linear regression (MLR) and partial least squares (PLS) regression are usually used in addition to the more traditional univariate linear regression (ULR) (Brereton 2007). In previously published literature, various color models have been applied in different ways to solve analytical problems. For example, RGB has been used to extract the concentrations of trimethylamine (TMA) in meat-borne substances using PLS (Xiao-wei et al. 2016). González-Miret et al. also reported on the use of CIELAB in analyzing the concentrations of mineral contents in honey based on an established MLR equation (González-Miret et al. 2005). On the other hand, HSV could be used to effectively predict the degree of deprotonation of the indicator molecule under different membrane thickness conditions (Cantrell et al. 2010). In addition, the color models can be applied to monitor the changes in fluorescent sensing for detecting alkaline phosphatase activity (Upadhyay et al. 2020a), Cr³⁺, Cu²⁺, and Hg²⁺ (Upadhyay et al. 2018; Anand and Sahoo 2019). Since each of the color parameters contains a fraction of information that could be differently related to the predicted response, selecting an appropriate color model can strongly affect the predictive performance of the prediction models.
In addition to the choice of the color models, several data preprocessing methods, such as scaling, standardization, and normalization, could also influence the predictive accuracy of the developed models. The use of different data preprocessing methods can reveal different aspects of the data (Brereton 2003). An appropriate data preprocessing method could ensure that the right trends are being studied, and a further analysis of the methods will prevent the confusion caused by the emergence of nonessential information. Using the standardization method, the variations of each studied parameter can be adjusted to appear approximately on the same level so that they can be more easily compared. On the other hand, the data normalization method can be used to balance the overall variations of each sample. For example, a row scaling normalization method mathematically adjusts the summation of the parameter values to a constant value, e.g., 1. Although the data preprocessing method can dramatically affect the predictive performance of the calibration models, its use has not yet been studied much from an analytical perspective for this type of data.
This research study aimed to investigate the effects of various color models on the prediction performance of calibration models that had been constructed using colorimetric sensors. A set of commercially available colorimetric sensors used to monitor some quality parameters, such as the concentrations of manganese ion (Mn²⁺), copper ion (Cu²⁺), iron ion (Fe²⁺/Fe³⁺), nitrate (NO₃⁻), phosphate (PO₄³⁻), sulfate (SO₄²⁻), pH, and total hardness, were used for the purpose of demonstration. The test samples were made up of synthetic solutions and natural waters. In addition, the data preprocessing methods that seriously affected the predictive performance of the calibration models were investigated. This research study has emphasized that the choice of color models is important to the quantitative analysis using colorimetric sensors, while the selection of the data preprocessing methods could have significant influence on the predictive performance of the calibration models.

Colorimetric sensors
In this research study, the detected analytes were the chemical properties used to assess natural water quality, including manganese ion (Mn²⁺), copper ion (Cu²⁺), iron ion (Fe²⁺/Fe³⁺), nitrate (NO₃⁻), phosphate (PO₄³⁻), sulfate (SO₄²⁻), total hardness, and pH. The detection of Mn²⁺, Cu²⁺, Fe²⁺/Fe³⁺, NO₃⁻, and PO₄³⁻ was based on the color changes of a single chemical sensor. The detection of SO₄²⁻, total hardness, and pH was based on 4, 5, and 4 chemoresponsive sensors, respectively. The details of the colorimetric strips are summarized in Table 1. These strips were designed to serve as a screening test for semi-quantitative purposes. For example, the color changes of the Mn²⁺ sensor strip were calibrated in accordance with the presence of Mn²⁺ at discrete concentration levels of 0, 2, 5, 20, 50, and 100 ppm.

Standard reagents
Chemical solutions, such as nitric acid (HNO₃), acetic acid (CH₃COOH), ascorbic acid (C₆H₈O₆), sodium acetate (CH₃COONa), sodium phosphate (Na₂HPO₄), and ammonium acetate (CH₃COONH₄), were prepared using analytical-grade chemicals (Merck, Germany). To construct the calibration models, a series of chemical solutions was prepared, the details of which are summarized in Table 1 [...] and PO₄³⁻ solutions were based on the standard methods that had been previously described (Franson 1992). For each of the sensor strips, the solutions were separated into two sets: a training set that was used to establish the calibration model and a test set that was used to validate the predictive performance of the established calibration model.

Acquisition of color data
The colorimetric sensor detection method assumes that the changed color of the sensing materials could be related to the analyte concentration. To make a comparison of predictive performance, the color change of the chemical strips was converted into digital data using various color models such as RGB, CMYK, HSV, and CIELAB.

RGB
RGB (red, green, blue) is regarded as the most common digitally coded color model and employs the three primary colors of red, green, and blue, as illustrated in Fig. 1a. RGB is an additive color model, meaning that the presented colors are generated through combinations of the three primary colors. For example, yellow is represented by the combination of red and green (Krongchai et al. 2020). In this research study, the images were acquired using a scanner (L220, Epson, Philippines). The scanner resolution was set to 300 dpi with 8 bits of color depth. The RGB images were recorded in tagged image file format (TIFF) without data compression. The images were then processed with a set of scripts and functions developed in Matlab 2010. In this process, the first step was to screen out the pixels that were not considered part of the sensors and, therefore, should not be included in the calculation (Cantrell et al. 2010). In this step, the R, G, and B values of each pixel were compared with thresholds calculated from the values recorded from the images of each sensor. In this research study, the average intensity ± 3 standard deviations (SD) was used to set the thresholds. These values were obtained from all pixels of a selected sensing area of an image. Recorded intensities falling below or above these thresholds were interpreted as background or outliers and were discarded from the analysis.
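The pixel-screening step described above can be sketched as follows. This is a minimal illustration in Python rather than the Matlab scripts used in the study; the function names `screen_pixels` and `mean_rgb`, and the use of the population standard deviation, are assumptions made for the example.

```python
from statistics import mean, pstdev

def screen_pixels(pixels, n_sd=3.0):
    """Keep pixels whose R, G and B values all lie within mean ± n_sd * SD.

    `pixels` is a list of (R, G, B) tuples taken from one selected sensing area.
    The population SD is used here; this choice is an assumption.
    """
    limits = []
    for values in zip(*pixels):            # iterate over the R, G and B channels
        centre, spread = mean(values), pstdev(values)
        limits.append((centre - n_sd * spread, centre + n_sd * spread))
    return [p for p in pixels
            if all(lo <= v <= hi for v, (lo, hi) in zip(p, limits))]

def mean_rgb(pixels):
    """Average each channel over the retained pixels."""
    return tuple(mean(channel) for channel in zip(*pixels))

# Hypothetical sensing area: 20 sensor pixels plus one saturated outlier.
region = [(100, 150, 200)] * 20 + [(255, 255, 255)]
kept = screen_pixels(region)
print(len(kept), mean_rgb(kept))
```

Pixels rejected in any one channel are dropped entirely, so the averaged color is computed only from pixels that look like the sensor in all three channels.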

CMYK
The CMYK (cyan, magenta, yellow, black) color model consists of three secondary colors, cyan, magenta, and yellow, and is presented in Fig. 1b. In contrast to RGB, it is a subtractive color model, as the final color is created by subtracting the primary colors from a white surface. Black, or the key (K), is included to improve the density range and to extend the available color gamut of the generated color. Like RGB, CMYK is a device-dependent color model, meaning that the characteristics of each data acquisition tool could alter the recorded values. In other words, the color intensities recorded by different scanners or cameras could be different. In this research study, CMYK values were converted from RGB values using the methods described previously (Cantrell et al. 2010). With 8 bits of color depth, color intensities ranged from 0 to 255. For example, absolute white is a combination of [0 0 0 0] in the CMYK color model.
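As an illustration of the RGB-to-CMYK conversion, the sketch below uses the widely known device-independent formula based on the key component. It is an assumption that this matches the conversion of Cantrell et al. (2010) used in the study; the function name `rgb_to_cmyk` is hypothetical.

```python
def rgb_to_cmyk(r, g, b, scale=255):
    """Convert 8-bit RGB to CMYK on a 0..scale range (simple, uncalibrated formula)."""
    if (r, g, b) == (0, 0, 0):
        return (0, 0, 0, scale)            # pure black: only the key component
    rp, gp, bp = r / 255, g / 255, b / 255
    k = 1 - max(rp, gp, bp)                # key (black) component
    c = (1 - rp - k) / (1 - k)
    m = (1 - gp - k) / (1 - k)
    y = (1 - bp - k) / (1 - k)
    return tuple(round(v * scale) for v in (c, m, y, k))

print(rgb_to_cmyk(255, 255, 255))  # absolute white
print(rgb_to_cmyk(0, 255, 255))    # white minus red
```

With this formula, absolute white maps to [0 0 0 0] and white minus red maps to pure cyan, in line with the description above.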

HSV
The HSV color model describes the color spectrum as a multidimensional model of three different factors: hue (H), saturation (S), and value (V) (Fan et al. 2019). The hue parameter specifies the color as an angular value between 0° and 360°. For example, red, green, and blue are located at 0°, 120°, and 240°, respectively (Thajee et al. 2018). The saturation or chroma (S), which is also known as purity, indicates the amount of color, ranging from 0 to 1 or 0 to 100%. A color with 100% saturation is the purest color possible, whereas a grayer color appears with less saturation. The value (V) describes the brightness or intensity of the color, ranging from 0 to 1, where 0 refers to absolute black and 1 indicates 100% brightness with no black mixed into the color at all. These three components can be visualized using a color geometric cone as illustrated in Fig. 1c, where hue is a given point on the color wheel. Saturation is represented as the radius of the cone, while the value component is represented as the height of the cone.
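The hue angles quoted above can be checked with a short sketch based on Python's standard `colorsys` module. The wrapper name `rgb_to_hsv_degrees` is hypothetical, and the study itself performed the conversion in Matlab.

```python
import colorsys

def rgb_to_hsv_degrees(r, g, b):
    """Convert 8-bit RGB to (H in degrees, S in 0..1, V in 0..1)."""
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    return h * 360, s, v

for name, rgb in [("red", (255, 0, 0)), ("green", (0, 255, 0)), ("blue", (0, 0, 255))]:
    h, s, v = rgb_to_hsv_degrees(*rgb)
    print(f"{name}: H = {h:.0f} deg, S = {s:.2f}, V = {v:.2f}")
```

A gray pixel such as (128, 128, 128) returns S = 0, which is why the hue of weakly colored sensing areas carries little information.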

CIELAB
CIELAB, or CIE L*a*b*, is a color model that imitates the human perception of image colors (Sharifzadeh et al. 2014). The CIELAB color model characterizes colors using three components. The first component is the lightness of the color (L*), and the others are the chromatic components (a* and b*), as shown in Fig. 1d. Usually, the L* value ranges from 0 to 100, where L* = 0 and L* = 100 refer to black and white, respectively. The a* value defines the position along the red-green axis, where positive and negative values indicate red and green, respectively. On the other hand, the b* value defines the position along the yellow-blue axis, where positive and negative values indicate yellow and blue, respectively. The scaling and limits of the a* and b* axes depend upon the specific implementation, and they run in the range of ±110 for 8 bits of color depth. This color model is similar to the HSV color model in that the intensity is decoupled from the chromaticity. Because the tonality changes are linear, the chromaticity differences can be computed using the Euclidean distance. Therefore, CIELAB is considered a device-independent color model (Sharifzadeh et al. 2014).
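The text does not state how RGB values were converted to CIELAB. The sketch below is one common route, assuming that the scanner output can be treated as sRGB and that the D65 reference white is used; both are assumptions, and the function name `srgb_to_lab` is hypothetical.

```python
def srgb_to_lab(r, g, b):
    """Convert 8-bit sRGB to CIE L*a*b* (assumes sRGB primaries, D65 white)."""
    def linearize(c):
        c = c / 255
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    rl, gl, bl = linearize(r), linearize(g), linearize(b)
    # Linear sRGB -> CIE XYZ
    x = 0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl
    y = 0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl
    z = 0.0193339 * rl + 0.1191920 * gl + 0.9503041 * bl

    def f(t):
        delta = 6 / 29
        return t ** (1 / 3) if t > delta ** 3 else t / (3 * delta ** 2) + 4 / 29

    # Normalize by the D65 white point (Xn, Yn, Zn) = (0.95047, 1.0, 1.08883)
    fx, fy, fz = f(x / 0.95047), f(y / 1.0), f(z / 1.08883)
    return 116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz)

print([round(v, 1) for v in srgb_to_lab(255, 255, 255)])
print([round(v, 1) for v in srgb_to_lab(255, 0, 0)])
```

White gives L* close to 100 with a* and b* close to 0, red gives a positive a*, and blue gives a negative b*, consistent with the axis definitions above.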

Data preprocessing of the decoded digital values
Data preprocessing involves applying a mathematical modification to the analytical values prior to the data analysis. A numerical dataset may be shifted or rescaled with respect to the variables or the samples to place emphasis on the desired information. The use of a data preprocessing method could seriously affect not only the predictive ability but also the interpretation of the models (Famili et al. 1997). Common data preprocessing methods that can be applied prior to calibration include data scaling, standardization, normalization, and various distance-based methods. In this research study, the decoded digital values obtained from I samples with J color parameters were organized into a data matrix X, where x_ij is the element in the ith row and the jth column of X. The preprocessed data were presented in a data matrix Z.

Basic data preprocessing
To analyze color data, many researchers have reported on the use of raw data for prediction purposes without applying any data preprocessing method (Pla-Tolós et al. 2016). In many cases, simple mathematical modifications can also be performed prior to making a prediction. Color change can be evaluated using the difference between the intensities obtained before and after chemical exposure (Zaragozá et al. 2015). The difference between the RGB values obtained from the before and after images has been calculated as the absolute value of the color change (Tahir et al. 2016). Some studies have reported on the use of the differences in the values after the colors were converted into a gray scale format (Morsy et al. 2016). In addition, the intensities of a background area where no optical dyes were present have been subtracted from the obtained color intensities in order to compensate for the non-uniformity in the background light (Chen et al. 2017). Lastly, the logarithm of the ratio between the color intensities determined before and after the reaction has been calculated to represent the effective intensity (De Almeida et al. 2015).
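The basic modifications listed above can be sketched as short functions operating on one sensor's color parameters. The function names, the base-10 logarithm, and the before/after direction of the ratio are assumptions made for illustration; the intensities are assumed to be positive for the log ratio.

```python
import math

def absolute_change(before, after):
    """Absolute difference of each color parameter before and after exposure."""
    return [abs(a - b) for b, a in zip(before, after)]

def background_subtracted(sensor, background):
    """Subtract the dye-free background (e.g. R - Rwb, G - Gwb, B - Bwb)."""
    return [s - w for s, w in zip(sensor, background)]

def log_ratio(before, after):
    """Log10 of the before/after intensity ratio (direction is an assumption)."""
    return [math.log10(b / a) for b, a in zip(before, after)]

before, after = [200, 200, 200], [150, 210, 200]   # hypothetical RGB values
print(absolute_change(before, after))
print(background_subtracted(after, [250, 250, 250]))
print(log_ratio([100, 100, 100], [10, 50, 100]))
```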

Standardization
Standardization is an extension of mean centering and can be carried out by dividing each variable by its standard deviation after mean centering (Brereton 2003):
$$ z_{ij} = \frac{x_{ij} - \bar{x}_j}{s_j} $$
where $\bar{x}_j$ and $s_j$ are the mean and the standard deviation of the intensities in the jth variable, respectively. After standardization, all the variables are adjusted to the same scale. This can be useful when some variables are abundant in all samples but their variation is not very significant, whereas the change in concentration of a minor chemical might be significant in relation to the underlying system. Standardization can be used to ensure that the significance of the variables with low intensity is comparable to that of the variables with higher intensities. Without standardization, the predictive results may rely only on the variables having relatively high intensities.
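A minimal sketch of column-wise standardization is given below. The function name `standardize` is hypothetical, and the use of the sample standard deviation (divisor I − 1) is an assumption, since the population form is also common in chemometrics.

```python
from statistics import mean, stdev

def standardize(X):
    """Mean-center each column of X and divide it by its standard deviation.

    X is a list of samples (rows), each a list of J color parameters.
    Columns with zero variance would cause a division by zero.
    """
    columns = list(zip(*X))
    means = [mean(c) for c in columns]
    sds = [stdev(c) for c in columns]      # sample SD; an assumption
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in X]

# Two hypothetical color parameters with very different intensity ranges.
X = [[1, 10], [2, 20], [3, 30]]
print(standardize(X))
```

After standardization both columns carry the same values, which shows how a low-intensity parameter is placed on the same footing as a high-intensity one.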

Normalization
Normalization can be done by dividing each element of a sample by a constant. If a sample is scaled to a constant total, the sum over all variables after normalization is equal to 1. In this research study, each sample was normalized to the summation of the square values of the color intensities. Normalization can be useful if it is difficult to precisely control the absolute intensities of the samples. Variations between experimental runs that may not be completely controlled could be minimized after data normalization. After normalization, each parameter contains information about the relative proportion of the compounds in the sample, and a comparison between the samples reveals information about how the ratios between the compounds differ.
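The row normalization can be sketched as below. It is assumed here that each sample is divided by the square root of its summed squared intensities, so that the squared values of each row sum to 1; the exact divisor used in the study is not recoverable from the text, and the function name `normalize_rows` is hypothetical.

```python
import math

def normalize_rows(X):
    """Scale each sample (row) so that its squared values sum to 1.

    The square-root divisor is an assumption about the normalization used.
    """
    Z = []
    for row in X:
        norm = math.sqrt(sum(x * x for x in row))
        Z.append([x / norm for x in row])
    return Z

# Two hypothetical scans of the same color under different overall brightness.
print(normalize_rows([[3, 4], [30, 40]]))
```

Both rows normalize to the same values, which illustrates how an overall intensity shift between runs is removed while the ratios between color parameters are kept.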

Dissimilarity-based methods
Instead of using the decoded digital values directly for the calculations, it was possible to evaluate the color change using the dissimilarity between the colors of the tested sensors and the background of the reference images (Hoang et al. 2017). The color dissimilarities can be evaluated using Euclidean and Mahalanobis distances. The Euclidean distance of sample i (represented by the row vector $\mathbf{x}_i$) to the centroid $\mathbf{x}_c$ is calculated by
$$ z_i^2 = (\mathbf{x}_i - \mathbf{x}_c)(\mathbf{x}_i - \mathbf{x}_c)^{\mathsf{T}} $$
where $z_i^2$ is the squared Euclidean distance between sample i and the centroid. In applications of CIELAB, this is a linear distance in the 3-dimensional space of the color components. The Mahalanobis distance is regarded as a standardized Euclidean distance and can be calculated by
$$ z_i^2 = (\mathbf{x}_i - \mathbf{x}_c)\,\mathbf{S}^{-1}(\mathbf{x}_i - \mathbf{x}_c)^{\mathsf{T}} $$
where $\mathbf{S}$ is a diagonal matrix corresponding to the standard deviation of each variable.
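The two dissimilarity measures can be sketched as follows. For the standardized form it is assumed that each squared difference is divided by the variance of the corresponding variable, which is the diagonal simplification described above rather than a full covariance-based Mahalanobis distance; the function names are hypothetical.

```python
from statistics import mean

def centroid(X):
    """Column means of the reference samples."""
    return [mean(column) for column in zip(*X)]

def squared_euclidean(x, xc):
    """Squared Euclidean distance between sample x and centroid xc."""
    return sum((a - b) ** 2 for a, b in zip(x, xc))

def squared_standardized(x, xc, sd):
    """Squared distance with each variable scaled by its standard deviation.

    Equivalent to a Mahalanobis distance with a diagonal matrix of variances
    (an assumption; correlations between variables are ignored).
    """
    return sum(((a - b) / s) ** 2 for a, b, s in zip(x, xc, sd))

reference = [[0, 0], [2, 0], [0, 2], [-2, 0], [0, -2]]   # hypothetical background colors
xc = centroid(reference)
print(squared_euclidean([3, 4], xc))
print(squared_standardized([3, 4], xc, [1, 2]))
```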

Regression methods
In this research study, quantification was achieved based on the color models using univariate linear regression (ULR), multivariate linear regression (MLR), and partial least squares (PLS) regression.

ULR and MLR
ULR is a simple linear regression that describes the relationship between one predictive (independent) variable and one response (dependent) variable as a first-degree polynomial, assuming the data have a linear association, as follows:
$$ y = ax + c $$
where y is the response variable or, in this research study, the concentration of the sample analyte, which is estimated from a single color parameter x. The parameters a and c are the slope and intercept of the univariate linear regression line, respectively. MLR is similar to ULR in that the prediction of the response is based on a linear regression. However, more than one independent variable contributes to the estimation of the response variable, which is expressed using simple matrix operations:
$$ \mathbf{y} = \mathbf{X}\mathbf{b} $$
where $\mathbf{y}$ is a vector containing the response values estimated from the predictive data contained in the matrix $\mathbf{X}$, while $\mathbf{b}$ is a vector containing the coefficient of each predictive variable, indicating the contribution of that variable to the estimation of the responses. The calculation of $\mathbf{b}$ using the pseudo-inverse matrix operation has been described previously (Brereton 2009).
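The two regressions can be sketched in plain Python as below. The MLR coefficients are obtained here by solving the normal equations, which corresponds to the pseudo-inverse when X has full column rank; this route, the helper names, and the column of ones used for an intercept are assumptions made for illustration.

```python
def fit_ulr(x, y):
    """Least-squares slope a and intercept c of y = a*x + c."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    a = sxy / sxx
    return a, my - a * mx

def solve(A, v):
    """Solve A b = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    M = [row[:] + [vi] for row, vi in zip(A, v)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    b = [0.0] * n
    for i in reversed(range(n)):
        b[i] = (M[i][n] - sum(M[i][k] * b[k] for k in range(i + 1, n))) / M[i][i]
    return b

def fit_mlr(X, y):
    """Coefficients b of y = X b from the normal equations (X'X) b = X'y."""
    cols = list(zip(*X))
    XtX = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]
    Xty = [sum(p * yi for p, yi in zip(ci, y)) for ci in cols]
    return solve(XtX, Xty)

print(fit_ulr([0, 1, 2, 3], [1, 3, 5, 7]))
# First column of ones models an intercept (hypothetical data).
print(fit_mlr([[1, 1, 0], [1, 0, 1], [1, 1, 1], [1, 2, 1]], [3, 4, 6, 8]))
```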

PLS
PLS is a multivariate calibration model that captures variations from both the predictive (X) and response (y) data (Albayrak et al. 2019). The prediction is carried out based on the assumption that the covariance between those extracted variations is maximized (Brereton 2007). Therefore, in most cases, PLS is regarded as a powerful calibration method that could provide satisfactory predictive results (Xiaowei et al. 2015; Bordbar et al. 2018).
In this research study, the prediction of PLS was achieved following the PLS1 algorithm that has been previously described (Brereton 2003). The number of PLS latent vectors was optimized using leave-one-out cross-validation (LOO-CV) (Alberti et al. 2020).
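The selection of the number of latent vectors by LOO-CV can be sketched with a small PLS1 implementation. This is a generic NIPALS-style PLS1 written for illustration and is not taken from the study's Matlab code; all function names and the example data are hypothetical.

```python
import math

def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def pls1_fit(X, y, n_components):
    """Fit PLS1 on mean-centered data; returns means and per-component (w, p, q)."""
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(m)] for row in X]
    f = [yi - y_mean for yi in y]
    comps = []
    for _ in range(n_components):
        w = [_dot([row[j] for row in E], f) for j in range(m)]   # weights ~ X'y
        norm = math.sqrt(_dot(w, w))
        if norm < 1e-12:                                         # nothing left to model
            break
        w = [wj / norm for wj in w]
        t = [_dot(row, w) for row in E]                          # scores
        tt = _dot(t, t)
        p = [_dot([row[j] for row in E], t) / tt for j in range(m)]  # X loadings
        q = _dot(f, t) / tt                                      # y loading
        E = [[row[j] - ti * p[j] for j in range(m)] for row, ti in zip(E, t)]
        f = [fi - q * ti for fi, ti in zip(f, t)]
        comps.append((w, p, q))
    return x_mean, y_mean, comps

def pls1_predict(model, x):
    """Predict the response of one sample by sequential projection and deflation."""
    x_mean, y_mean, comps = model
    e = [xj - mj for xj, mj in zip(x, x_mean)]
    y_hat = y_mean
    for w, p, q in comps:
        t = _dot(e, w)
        y_hat += q * t
        e = [ej - t * pj for ej, pj in zip(e, p)]
    return y_hat

def loo_rmsecv(X, y, n_components):
    """Root mean square error of leave-one-out cross-validation."""
    total = 0.0
    for i in range(len(X)):
        model = pls1_fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:], n_components)
        total += (pls1_predict(model, X[i]) - y[i]) ** 2
    return math.sqrt(total / len(X))

def best_n_components(X, y, max_components):
    """Number of latent vectors giving the lowest LOO-CV error."""
    return min(range(1, max_components + 1), key=lambda a: loo_rmsecv(X, y, a))

X = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 6]]
y = [9, 8, 19, 18, 29]                      # y = 2*x1 + 3*x2 + 1
print(best_n_components(X, y, 2))
```

Each sample is left out once, the model is rebuilt on the remaining samples, and the number of latent vectors with the lowest cross-validated error is retained.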

Statistical merits
The predictive performance achieved in this research study was evaluated using the root mean square error of calibration (RMSEC), the root mean square error of prediction (RMSEP), the explained variances (R² and Q²), and the ratio between the RMSEP and RMSEC values (R_x). The RMSEP is calculated as follows:
$$ \mathrm{RMSEP} = \sqrt{\frac{\sum_{i=1}^{I} (y_i - \hat{y}_i)^2}{I}} $$
where $\hat{y}_i$ is the predicted concentration for the test sample $\mathbf{x}_i$ having the actual value of $y_i$. The number of samples is denoted by I. The RMSEC value can be calculated using the same equation as the RMSEP value but using the training samples. The Q² and R² values for the predictive results of the test samples and training samples, respectively, can be calculated using the same equation:
$$ Q^2 \ (\text{or } R^2) = 1 - \frac{\sum_{i=1}^{I} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{I} (y_i - \bar{y})^2} $$
where $\bar{y}$ is the mean of the actual values. Ideally, the predicted concentrations should lie along the diagonal line, indicating that the predicted and actual concentrations are the same. In this case, the R² or Q² values will be high and close to 1, which implies that a greater degree of the variation within the data is modeled by the calibration model. In addition, the ratio of RMSEP to RMSEC is calculated as follows:
$$ R_x = \frac{\mathrm{RMSEP}}{\mathrm{RMSEC}} $$
This statistical ratio can be used to indicate the robustness of the developed model. It is expected that the ratio will be as close to 1 as possible. A value that is higher than 1 implies that the model could be prone to an overfitting problem. In that case, the predictions may be consistent for the training samples but inconsistent when the model is used to estimate the unknown variations of the test samples.
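The statistical merits can be sketched as three small functions. The function names are hypothetical, and the explained variance is computed in the usual form of one minus the ratio of the residual to the total sum of squares, which is assumed to match the equation used in the study.

```python
import math

def rmse(actual, predicted):
    """Root mean square error; RMSEC for training samples, RMSEP for test samples."""
    squared = sum((y - yh) ** 2 for y, yh in zip(actual, predicted))
    return math.sqrt(squared / len(actual))

def explained_variance(actual, predicted):
    """R2 (training samples) or Q2 (test samples): 1 - SS_residual / SS_total."""
    y_bar = sum(actual) / len(actual)
    ss_res = sum((y - yh) ** 2 for y, yh in zip(actual, predicted))
    ss_tot = sum((y - y_bar) ** 2 for y in actual)
    return 1 - ss_res / ss_tot

def robustness_ratio(rmsep, rmsec):
    """R_x = RMSEP / RMSEC; values far from 1 suggest over- or underfitting."""
    return rmsep / rmsec

actual = [0, 2, 5, 20, 50, 100]             # hypothetical concentrations (ppm)
predicted = [1, 2, 6, 18, 52, 99]           # hypothetical model output
print(rmse(actual, predicted), explained_variance(actual, predicted))
```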

Real samples
Four water samples were collected in Chiang Mai, Thailand, representing the three water types that were investigated in this study. These included a sewage canal (W1), a wastewater treatment pond (W2), and water reservoirs (W3 and W4). The water samples were collected in December 2018. The concentrations of metal ions were determined using the air-acetylene flame method (Rice et al. 2017). The turbidimetric method was used to measure the concentrations of SO₄²⁻ (Rice et al. 2017). NO₃⁻ was assessed following the brucine colorimetric method (George 2012). The pH values were measured using a pH meter (SevenCompact pH/Ion meter S220, Mettler-Toledo, Switzerland). Total hardness was evaluated using the titrimetric method (Rice et al. 2017). Lastly, the ascorbic acid method was used to quantify the concentrations of PO₄³⁻ (Rice et al. 2017).

Results and discussion
Image visualization using different color models
The detection of SO₄²⁻, pH, and total hardness was based on colorimetric strip sensor arrays composed of 4, 4, and 5 sensing elements, respectively. Figure 3 displays the images of a pH strip visualized by the four different color models, wherein the differences can hardly be noticed. Different color models can be used to visualize the same color spectrum, although different color parameters need to be used. However, differences could be observed when the images were decomposed and presented through individual color parameters. The RGB (Fig. 3a) and CMYK (Fig. 3b) models characterized the sensor images based on the primary and secondary colors, respectively. On the other hand, the colors were characterized by three different color factors in the cases of HSV and CIELAB.
The PCA score plots of the color data obtained from each colorimetric strip are presented in Fig. 4. Because the color models are characterized by different numbers of parameters, the scores of each dataset were adjusted through Procrustes transformation so that the different types of measurements could be compared and presented within the same PCA space (Andrade et al. 2004). Linear trends can be observed in the color data acquired from most colorimetric strips. For example, the concentrations of Mn²⁺ increased along with the increases of the PC1 and PC2 scores. Therefore, the samples were distributed along the diagonal line of the PCA space, implying that the use of linear calibration methods was sensible. The scattering trend in the total hardness detection could be due to the fact that the strip was originally fabricated for determining the concentration levels of CaCO₃ for screening purposes (see the concentration ranges presented in Table 1). In each PCA score plot, when a sample was presented using different color models, it was located differently in the PCA space. This implied that variations in the different color data could create alterations in the predictive performance when they were used to establish the calibration models.

Effect of different color models
To compare the predictive abilities of the calibration models established using different color parameters, ULR models were constructed using the training samples of each colorimetric strip. In this comparison, the digitally coded values were used without any data preprocessing. For the sensor arrays with multiple chemical detections, the multivariate data were calibrated using the PLS method. The correlations between the actual and predicted concentrations of the test samples for each of the colorimetric strips are illustrated in Fig. 5. The predictive results of the established models are also summarized in Supplementary Materials Table S1.

Prediction of colorimetric strips using a single chemical sensor
In Fig. 5a-e, variations in the predictive results can be observed in the ULR predictions using different color parameters. For example, in the detection of Mn²⁺ (Fig. 5a), the ULR calibration model based on the G parameter, or the greenness of the image, resulted in the lowest RMSEP value of 4.14 with a Q² value of 0.9813. As a result, the samples were placed closer to the diagonal line of the correlation graph. In contrast, the ULR calibration model using the B parameter resulted in relatively higher predictive errors, with RMSEC and RMSEP values of 26.92 and 18.58, respectively. For the detection of Mn²⁺, PO₄³⁻, and NO₃⁻, the color parameters that were most similar to the color developed by the chemical sensor could offer satisfactory results. For example, in the detections of PO₄³⁻ and Mn²⁺, where the sensor change can be correlated to an increase in the color green, the lowest RMSEP value was obtained with the ULR model using the G parameter. Similarly, the ULR model based on the color magenta (M parameter) resulted in the best detection of NO₃⁻. In this case, the color of the strip changed from white to magenta. However, this was not always the case. The detections of Fe²⁺/Fe³⁺ and Cu²⁺ were indicated by the appearance of royal purple (blue-magenta) and dark magenta, respectively. However, the K and G parameters could result in lower predictive errors, whereas the use of the M parameter, which was supposed to be more closely related to changes in Cu²⁺ concentrations, resulted in a relatively higher predictive error (RMSEP of 47.89). With regard to the use of HSV and CIELAB in this situation, poorer predictive performance was obtained, with low Q² and high RMSEP values, as presented in Supplementary Materials Table S1. These color models decompose the color spectrum into different light components. For example, in HSV, the H parameter represents the actual color, and S and V represent color saturation and brightness, respectively. Since the H parameter mainly indicates the difference between colors, it could not effectively detect changes in color intensity. For instance, in the detection of Cu²⁺ (Fig. 5c), where the sensor changed from light magenta to dark magenta, the use of the H parameter resulted in a considerably higher prediction error (RMSEP of 88.84) than the use of the B parameter (RMSEP of 35.83). In addition, when the light component parameters (S and V) were used separately for the purposes of prediction, they could be more affected by the inconsistency of the light source of the scanner. The inconsistency of the light source is demonstrated and discussed in the next section.

Prediction of colorimetric strips using multiple chemical sensors
With regard to the detections of SO₄²⁻, pH, and total hardness using chemical strips composed of more than one sensing element, the multivariate regression using PLS could outperform both the ULR and MLR models (Supplementary Materials Tables S7-S9). Variations could still be observed when the predictions were based on different color models. For example, in the detection of SO₄²⁻, the PLS-RGB model produced a lower RMSEP value than the PLS-CMYK model, with RMSEP values of 94.25 and 206.69, respectively. Interestingly, when the intensity data were used without any data preprocessing, the prediction errors based on the multivariate use of the HSV or CIELAB color models were not notably high when compared with those based on the individual use of their parameters (Supplementary Materials Table S1). These predictive outcomes demonstrate the advantage of multivariate calibration, where all of the recorded parameters are used simultaneously to establish the prediction model in order to achieve the best predictive performance.

Effects of different data preprocessing methods
To investigate the effects of different data preprocessing methods on the predictive performance of the colorimetric calibration models, all preprocessing methods were applied to each of the digitally coded datasets prior to model construction. The predictive results are presented in Supplementary Materials Tables S2-S9. The colorimetric prediction models that offered the five lowest RMSEP values are listed in Table 2 for comparison.
In addition to the choice of the color model, the preprocessing method could dramatically affect the predictive performance of the calibration models. Compared with the models employing raw data (no data preprocessing), the predictive results could be either improved or diminished when the various data preprocessing methods were applied. For example, in the determination of NO₃⁻, the RMSEP value decreased from 44.31 to 28.28 after the RGB values were treated with logarithm scaling prior to the PLS modeling. In contrast, using the HSV values, the PLS model resulted in a higher prediction error, with an RMSEP value of 59.58, when the data were standardized prior to analysis.

Prediction of colorimetric strips using a single chemical sensor
The detections of Fe2+/Fe3+, Cu2+, and PO43– were performed using colorimetric strips with a single chemical sensor. In this study, calibration models built without any data preprocessing yielded the lowest RMSEP values for these strips. These prediction models were based on a simple linear regression of the change in color intensity.
In the case of Fe2+/Fe3+ determination, although the use of ULR provided the smallest RMSEP value of 5.80, the prediction of the training samples was relatively poor, with an RMSEC value as high as 12.60. Consequently, this model was prone to underfitting, with an Rx ratio of 0.48. In this situation, a more stable model is recommended, even at the cost of a slight increase in the RMSEP value. For example, the RGB-PLS model could be employed with absolute color changes as the data preprocessing method, although the error in the prediction of the test samples then increased slightly to 8.02. In the detection of Fe2+/Fe3+, the MLR and PLS models provided comparable predictive results. Both models gave equally satisfactory predictions, with high R2 and Q2 values and low RMSEC and RMSEP values. This was the case because all parameters could be significant for model prediction and because the optimum number of PLS latent variables was close to, or the same as, the number of parameters used.
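The comparison between calibration and prediction errors can be illustrated with a short sketch. It assumes that the Rx ratio is RMSEP divided by RMSEC, which agrees with the direction of the values quoted in the text but is not defined in this section. The root mean square errors use a plain division by the number of samples (some definitions of RMSEC correct for degrees of freedom), and all concentrations below are hypothetical.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error between reference and predicted values."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))

def error_ratio(rmsec, rmsep):
    """Assumed definition of the ratio: RMSEP / RMSEC."""
    return rmsep / rmsec

# Hypothetical calibration (training) samples and their auto-predictions.
y_cal = [0.0, 5.0, 10.0, 20.0]
y_cal_pred = [1.0, 4.0, 11.0, 19.0]

# Hypothetical test samples and their predictions.
y_test = [2.0, 8.0, 15.0]
y_test_pred = [4.0, 6.0, 17.0]

rmsec = rmse(y_cal, y_cal_pred)     # error on the samples used to build the model
rmsep = rmse(y_test, y_test_pred)   # error on samples the model has not seen
ratio = error_ratio(rmsec, rmsep)
```

Under this assumed definition, a ratio well below 1 means the model predicts the test samples better than its own training samples (read in the text as underfitting), while a ratio well above 1 means the training fit is much better than the test prediction (read as overfitting).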
In contrast to the detections of Cu2+, Fe2+/Fe3+, and PO43–, the lowest errors in the prediction of the Mn2+ and NO3– concentrations were obtained with the multivariate PLS and MLR models. Notably, the best predictive results for Mn2+ were obtained from the RGB-PLS model with standardization. However, standardization is generally not recommended for spectroscopic data such as UV-Vis and NIR (Brereton 2009). After standardization, the variations of all variables are adjusted to the same range. Because not all wavelengths carry useful information about the predicted response, and some present only a pattern of noise, the standardization process might amplify the noise variations, and this undesirable variation could then lead to wrong predictions. In this case, although the PLS model with standardization resulted in the lowest RMSEP value of 3.42, the Rx value of 2.28 was considered relatively high. This implied that the errors in auto-prediction and in test prediction were too different and that the model was prone to overfitting (Wongsaipun et al. 2018). Therefore, the second best PLS model, in which the background was subtracted from the RGB values (R-Rwb, G-Gwb, B-Bwb), should be used instead to avoid this inconsistency. The determination of NO3– demonstrated that an appropriate data preprocessing step could positively affect the predictive performance of the model. In this case, the CMYK model with background subtraction offered the best predictive performance for the synthetic test samples. Normally, PLS could provide satisfactory predictive results. However, in this case, PLS and MLR gave only comparable predictive results. This could have occurred because, when the number of variables is not high, the number of PLS latent variables is close to, or the same as, the number of parameters in the original data space, so that PLS produces outcomes similar to those of MLR.
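The background subtraction mentioned above (R-Rwb, G-Gwb, B-Bwb) can be sketched in a few lines. The sketch assumes that Rwb, Gwb, and Bwb are channel values measured on a blank reference area of the same image; the subscript is not defined in this section, and the function name and all numbers are hypothetical.

```python
def subtract_background(strip_rgb, background_rgb):
    """Return (R - Rwb, G - Gwb, B - Bwb) for one sensor reading."""
    return tuple(s - b for s, b in zip(strip_rgb, background_rgb))

# Hypothetical background reading taken from a blank area of the image.
background = (250.0, 248.0, 245.0)

# Hypothetical sensor readings for two strips captured in the same image.
readings = [
    (200.0, 180.0, 90.0),
    (150.0, 110.0, 70.0),
]

corrected = [subtract_background(r, background) for r in readings]
```

Referencing each reading to the background of its own image removes any additive offset shared by the strip and the background, for example a uniform change in illumination between images, before the values enter the calibration model.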

Prediction of colorimetric strips using multiple chemical sensors
The determinations of SO42–, pH, and total hardness were based on colorimetric strips equipped with multiple chemical sensors. To utilize all of the sensors in the array, ULR would not be practical, since it can only model one independent variable at a time for the prediction of the response. Although the MLR model could be calculated, the calibration algorithm resulted in high predictive errors in all cases. This was because the number of modeling samples was lower than the number of predictive variables; therefore, the pseudo-inverse method was mathematically inappropriate (Saxena and Prathipati 2003). Consequently, PLS provided the best predictive results, with low predictive errors, for the colorimetric strips equipped with multiple chemical sensors. The high value of Q2 with a low Rx implied that the developed model could be practically used for the prediction of unknown or test samples.

Table 3 presents the predictive results of the sensor strips used for determining the concentrations of the selected water monitoring parameters using the PLS calibration models. According to the standard methods, the concentrations of Mn2+, Fe2+/Fe3+, Cu2+, NO3–, PO43–, and SO42– in the samples were very low or non-detectable. Therefore, to evaluate the accuracy of the developed calibration models, the analytes were spiked into the samples, and the predictive results were reported as the %recovery.

Table S4. Predictive performance of Cu2+ test strips using ULR, MLR and PLS coupled with various forms of data preprocessing.
Table S5. Predictive performance of NO3– test strips using ULR, MLR and PLS coupled with various forms of data preprocessing.
Table S6. Predictive performance of PO43– test strips using ULR, MLR and PLS coupled with various forms of data preprocessing.
Table S7. Predictive performance of SO42– test strips using ULR, MLR and PLS coupled with various forms of data preprocessing.
Table S8. Predictive performance of pH test strips using ULR, MLR and PLS coupled with various forms of data preprocessing.
Table S9. Predictive performance of total hardness test strips using ULR, MLR and PLS coupled with various forms of data preprocessing.