Co-simulated Size Number: An Elegant Novel Algorithm for Identification of Multivariate Geochemical Anomalies

Identification of geochemical anomalies is of particular importance for tracing the footprints of anomalies. This can be implemented by advanced techniques of exploratory data analysis, such as fractal/multi-fractal approaches based on priori or posteriori distribution of geochemical elements. The latter workflow involves analysis of 2D/3D produced maps, which can be mostly obtained by geostatistical algorithms. There are two challenging issues for such an analysis. The first one corresponds to handling the cross-correlation structures among the data, and the second one relates to the compositional nature of data. To tackle these problems, this paper investigates the application of Gaussian co-simulation for modeling the cross-correlated compositional data in order to recognize the multivariate geochemical anomalies in integration with fractal analysis. In this context, an innovative algorithm, namely co-simulated size number (CoSS-N), is introduced for this purpose. The compositional nature of data is addressed by additive log-ratio transformation of original data while the Gaussian co-simulation handles the reproduction of cross-correlation among the components. The co-simulated outputs are then taken into account for capturing different geochemical populations, showing different levels of backgrounds and anomalies. The algorithm is illustrated via a real case study located in Philippine wherever seven geochemical components are required to be considered. The accuracy of results is examined by statistical validation techniques, indicating the capability of the CoSS-N algorithm for multivariate identification of geochemical anomalies.


INTRODUCTION
Identification of geochemical anomalies by separation of geochemical data into different populations is an essential aspect of mineral exploration. Unlike traditional approaches of anomaly recogni-tion, which are based only on statistical parameters, fractal/multi-fractal modeling takes into account not only the frequency distribution of data but also spatial deviation of data from background due to ''self-similar'' fractal geometry (Mandelbrot 1983;Feder 1988;Cheng 2012;Sadeghi et al. 2012Sadeghi et al. , 2015He et al. 2013;Luz et al. 2014;Zuo and Wang 2016).
Fractal analysis for delineation of geochemical anomalies can be classified mainly into two groups (Madani and Sadeghi 2019). The first group deals with global distribution of a variable, in which the input to fractal analysis is based on sample frequency. For instance, the number-size model de-scribes the relationship between number of objects (i.e., samples) and their sizes, which can be element concentration or metal grade (Mandelbrot 1983;Agterberg 1995). Although this methodology has vast applications in the analysis of geochemical data for mineral exploration, the spatial distributions of sample locations are not absolutely effective in capturing geochemical anomalies. In the particular case of scarce data, the sample distributions may not represent reliable statistical parameters of data populations, leading to bias in the inference of thresholds for defining data populations. In this case, the algorithm proposed by Madani and Sadeghi (2019) using Monte Carlo simulation algorithm (GSS-N) overcomes the problem of scarce data and thus facilitates straight-line fitting on log-log plots, leading to less bias in the identification of thresholds for defining data populations.
The conditional or posteriori distribution of a geochemical variable is the basis of the second group of fractal analysis related to the description of sample data that are mapped in either regular or irregular grid locations. For instance, the concentration-area (C-A) and concentration-perimeter (C-P) fractal models (Cheng et al. 1994(Cheng et al. , 1996, spectrum-area (S-A) fractal model (Cheng et al. 1999(Cheng et al. , 2000, concentration-volume (C-V) fractal model (Afzal et al. 2010), spectrum-volume (S-V) fractal model (Afzal et al. 2011), wavelet-number (W-N) fractal model (Chen and Cheng 2018) and simulated size-number (SS-N) fractal model (Sadeghi et al. 2015) pertain to this group. The last model, which was developed based on integration of simulated values and their number (or frequency), combines geostatistical simulation and fractal modeling to identify reliable thresholds. For this model, Gaussian simulation, either by sequential Gaussian simulation (SGS) (Almeida and Journel 1994) or by turning bands simulation (TBSIM) (Emery and Lantué joul 2006) as the two widely used geostatistical algorithms in uncertainty quantification, is recommended for construction of a stochastic local distribution that is suitable in distinguishing depositrelated geochemical anomalies. The SGS relies on the sequential paradigm of simulating Gaussian random fields conditioned to the set of data derived from a simple kriging exercise (Chilè s and Delfiner 2012). Although this approach is considered reliable in geostatistical modeling, it has major difficulty in ascertaining moving neighborhood and in reproducing short-scale continuity (Lantué joul 1994). To address these shortcomings of SGS, it has been shown that simulation results can be optimized by increasing the size of neighborhood ranges (Emery 2004). The TBSIM, the fundamental concept of which is based on simulation of one-dimensional Gaussian random fields and spanning those to d-dimensional random fields, has the ability to address the issue of increasing neighborhood ranges. However, it may create stripping artifacts in the case of few turning lines, and so the idea is to increase the number of turning lines to weaken this kind of artifacts (Lantué joul 1994;Gneiting 1999;Emery and Lantué joul 2006). These two geostatistical simulation algorithms are adequate for univariate spatial modeling; however, when one is dealing with more than one cross-correlated variable, the modeling becomes challenging. One reason is that these univariate simulation approaches insufficiently take into account the dependency characteristics among variables and subsequently lead to deficient results. In the case of dealing with cross-correlated variables, however, co-simulation algorithms are highly advocated. Sequential Gaussian co-simulation (Gómez-Herná ndez and Journel 1993;Pebesma 2004) and turning bands co-simulation or TBCOSIM (Carr andMyers 1985, 1989;Emery 2008), for example, offer much flexibility to construct outcomes (scenarios) that preserve inter-relationships between variables while spatial variability is preserved. Comparing and contrasting the performance of these two algorithms, Paravarzar et al. (2015) demonstrated that TBCOSIM outperforms sequential Gaussian co-simulation with respect to reproducing the cross-correlation calculated by spatial continuity and statistical parameters. Broadly speaking, Gaussian (co)-simulation approaches are the offshoot of deterministic approaches such as kriging. Although this most commonly used interpolation method is robust, it suffers from some impediments such as smoothing effect such that maps of estimated variables show squeezed distribution compared to the distribution of original data (Chilè s and Delfiner 2012; Rossi and Deutsch 2014;Afzal et al. 2015;Battalgazy and Madani 2019). This impacts over-and underestimation of de-clustered original low and high values in data, respectively. Moreover, using this biased result for further examination, such as fractal analysis and capturing of anomalies, leads to serious biased inference of essential thresholds (Sadeghi et al. 2015).
It is quite often that Gaussian co-simulation of cross-correlated variables is dealt with geochemical compositional data. In this case, the implementation

14
Madani and Carranza of conventional geostatistical modeling on the original scale of these data may be problematic and produce inconsistent results (Reimann et al. 2008;Pawlowsky-Glahn et al. 2015;Pawlowsky-Glahn and Egozcue 2016). In this respect, two main issues must be addressed: one related to closure problem (Aitchison 1986) and the other concerns spatial continuity analysis of compositional data (Pawlowsky-Glahn 1984;Pawlowsky-Glahn and Egozcue 2016). Considering the spatial structure of data means that the covariance structure of sample points should be taken into account in compositional data, and whenever it is recognized through the Aitchison geometry with Euclidean vector space structure, the additive log-ratio transformation becomes an oblige coordinate system with only limited use in multivariate statistical analysis . A simple solution, namely logarithmic transformation of original data (Reimann et al. 2008), although well suited for right-skewed distributions, is not appropriate for tackling the closure problem. Instead, log-ratio transformation developed in the 1980s for this purpose (Aitchison and Shen 1980;Aitchison 1982Aitchison , 1986 (Aitchison 1986); (2) centered log-ratio (clr) transformation (Aitchison 1986); and (3) isometric log-ratio (ilr) transformation (Egozcue et al. 2003). Other approaches were introduced based on converting original compositional variables by stoichiometric closure characteristic (Mery et al. 2017;Adeli et al. 2018), which is not ascribed to logarithmic transformation.
In this paper, we propose an innovative algorithm, namely co-simulated size-number (CoSS-N), based on integration of geostatistical co-simulation of cross-correlated compositional data with fractal/multi-fractal analysis to enhance characterization of multivariate geochemical anomalies. This approach is illustrated through a case study pertaining to a geochemical exploration campaign in the Philippines. The geochemical compositions consist of seven components (arsenic, copper, zinc, nickel, cobalt, manganese and fill-up (or un-determined component)).
The main aim of this research is sixfold: (1) to present methodologies for fractal analysis, geostatistical co-simulation and compositional data analysis; (2) to propose an innovative algorithm, namely co-simulated size-number (CoSS-N); (3) to trial this algorithm on a real case study and discussing the relevant statistical analysis; (4) to compare results obtained from Gaussian co-simulation with independent simulation through global and local statistical validation techniques; (5) to derive multivariate geochemical anomalies by CoSS-N; and (6) to combine Gaussian co-simulation results and inferred thresholds for probabilistic description of anomalous areas for predictive mapping of epithermal Au deposits.

Number-Size Fractal Model (N-S)
The N-S fractal model was proposed by Mandelbrot (1983) for describing the distribution of geochemical data. According to this model, there is a relation between the number and size parameters of evaluated data, which can be expressed as (Mandelbrot 1983): where q denotes element concentration, Nð! qÞ denotes cumulative number of samples with concentration values greater than or equal to q, F is a constant and D is the scaling exponent or fractal dimension of the distribution of element concentrations. According to Mandelbrot (1983), log-log plots of Nð! qÞ vs. q show straight-line segments with different slopes, ÀD, corresponding to different concentration intervals. Agterberg (1995) proposed a ''concentrationsize'' multi-fractal model based on the N-S model in order to determine and describe the spatial distribution of geochemical attributes in large mineral deposits. Monecke et al. (2005) used the N-S model to describe geochemical data, which indicate enrichment of minerals by replacement due to metasomatic processes resulting in the formation of hydrothermal deposits in the Waterloo massive sulfide deposit, Australia. The power-law frequency model, which has been suggested based on the N-S model, measures the frequency distribution of element and mineral concentrations based on the 15 Co-simulated Size Number number of samples (Li et al. 1994;Turcotte 1996;Shi and Wang 1998;Zuo et al. 2009;Sadeghi et al. 2012). The first 3D modeling work based on the N-S fractal model was demonstrated by Sadeghi et al. (2012) for the separation of mineralized zones and wall rocks, and the precision, and thus, the applicability of the model was proven by comparison of results with those of the concentrationvolume (C-V) fractal model. The most important advantage of this model is that there is no need for preprocessing of data for pre-estimation before modeling. Sadeghi et al. (2015) presented the simulated size-number (SS-N) approach considering a Monte Carlo simulation over the local distribution. In this respect, different geostatistical simulation algorithms such as (TBSIM) (Emery and Lantuéjoul 2006) or SGS (Almeida and Journel 1994) have been recommended for construction of such a stochastic local distribution that can be used subsequently for distinguishing deposit-related anomalies. Sadeghi et al. (2015) used the TBSIM in their study for probabilistic modeling and further analysis of SS-N model, and they showed that the SS-N model outperforms the C-V model in terms of local and global statistical parameters reproduction. However, this algorithm is valid either when correlated covariates do not exist or when the global distribution and local distribution of the underlying variables are independent in the region.

Turning Bands Co-simulation
The existence of cross-correlation among certain variables motivates one to use Gaussian cosimulation approaches rather than independent simulation (Wackernagel 2003;Chilè s and Delfiner 2012;Eze et al. 2019;Abildin et al. 2019). The reason relates to taking into account the inter-dependency characteristic among certain variables, for which it leads to generate results that reproduce the local and global multivariate statistical parameters of data in the area of study. Among the Gaussian cosimulation techniques, TBCOSIM has received wide acceptance among practitioners because of its versatility and straightforwardness and it outperforms other Gaussian co-simulation algorithms with regard to reproducing cross-correlation parameters (Paravarzar et al. 2015). Therefore, in this study, we used TBCOSIM to show the capability of the proposed algorithm.
TBCOSIM is an extension of TBSIM, indicating an approximate algorithm based on the multi-Gaussian distribution assumption of the underlying random field as first introduced by Matheron (1973) and then extended in some organized program codes (Lantué joul 1994;Emery and Lantué joul 2006). The principal concept of this algorithm is based on, first, drawing plenty of lines with random orientations and, second, simulating a one-dimensional Gaussian random field along each line (Lantué joul 1994(Lantué joul , 2002. In other words, the concept of the turning bands algorithm is to make simpler the problem of simulation in R 3 or R 2 into an R problem. The random field fYðxÞ; x 2 R d g in Eq. 1 has zero mean and isotropic covariance C Y (Eq. 2), where U is a uniform distribution over S d , which is unit sphere of R d .
YðxÞ ¼ Xðhx; UiÞ for any location of 8x 2 R d ð2Þ where h and i is the inner standard products in R d , h is a vector of R d , x d is a uniform distribution over S d and r is modulus of R d . Matheron (1973) first introduced and proved Eq. 2 as a relationship between continuous and isotropic covariances in R d with continuous covariances in R. By using this algorithm, one can substitute multi-dimensional simulation by one-dimensional simulation. Having the covariance model fitted to the primary de-clustered normal score variable, the covariance function is derived from one-dimensional random fields. TBSIM provides a non-conditional multi-dimensional random field compatible with the target covariance model, in which the simulated values are practically standard Gaussian (Emery and Lantué joul 2006). In order to generate the conditional realizations, the non-conditional simulation obtained should be progressed through one postprocessing of kriging step (Journel and Huijbregts 1978;Emery 2008;Chilè s and Delfiner 2012).
In TBCOSIM, it is of interest to simulate stochastically the cross-correlated variables (more than two). In this case, the cross-covariance function is needed to construct such one-and multi-dimensional Gaussian random fields in the region. The non-conditional step is the same as TBSIM per variable; however, as part of the conditioning mechanism, co-kriging must be used rather than 16 Madani and Carranza kriging in order to keep the multivariate characteristics (Carr and Myers 1985;Myers 1989;Emery 2008); thus: where Y SCK ðxÞ is the simple co-kriging of YðxÞ from the conditioning data; Y S ðxÞ is the non-conditional simulation at location x for variables; Y SCK S ðxÞ is the simple co-kriging of the non-conditional simulation from its value at the data locations. This simple cokriging system can also be substituted for ordinary co-kriging paradigm (Emery 2007).
The general workflow in the latter case is similar to TBSIM, as previously explained; however, in the variogram analysis, since the co-kriging system is established on the basis of the cross-covariance matrix, it is necessary to calculate the direct and cross-variograms. In order to fit the theoretical direct and cross-variograms models, linear model of co-regionalization (LMC) as an alternative can be used to fit all experimental variograms as a linear combination of equivalent structures together with the identical ranges, but different in sills (Wackernagel 2003; Chilè s and Delfiner 2012). The most tedious part of this job is to construct the permissible positive semi-definiteness conditions in fitting the sill matrices. Once this constraint is corroborated, the model can be used in the variance-covariance matrix required in the conditioning process (Goovaerts 1997).

Compositional Data Analysis
In modeling of cross-correlated variables, it is common to deal with compositional data. In this context, these data are variables of a set of components forming a whole such that their sum can be chosen arbitrarily, without loss of information as a positive constant D on the scale of the underlying random vector, such as 1 (parts per unit), 100%, 10 6 (parts per million) and 10 9 (parts per billion) (Aitchison 1986;Pawlowsky-Glahn et al. 2015). A vector of d components ZðxÞ ¼ fZ 1 ðxÞ; Z 2 ðxÞ; . . . ; Z d ðxÞg is a composition of (Aitchison 1986;Pawlowsky-Glahn et al. 2015): The compositional space satisfying the condition in Eq. (5) is known as simplex S d .
It is also common that the sum of components may not be bounded by D and one needs to introduce the sub-composition by complementing the given components with a further component, namely fill-up or filler variable (Tolosana-Delgado et al. 2005. This characteristic is of paramount importance if the fill-up variable has a physical interpretation, in which it represents the totality of elements that were not assayed (Filzmoser et al. 2009;Tolosana-Delgado et al. 2018).
The modeling strategy of compositional data by geostatistical algorithms entails using some transformation techniques in order to preserve the summation constraint. Because the geostatistical-based algorithms have been mostly developed for original data without constraints, they implicitly follow the assumption of Euclidean geometry. Therefore, in the case of constant sum constraint, the statistical parameters such as global and local correlation coefficients may be deemed unrealistic (Pawlowsky-Glahn 1984;Aitchison 1986;Pawlowsky-Glahn and Egozcue 2016;Tolosana-Delgado et al. 2018). To address this difficulty, several log-ratio transformations have been introduced to remove the positively and sum constant of compositional data. Among others, the alr transformation, as a widely used approach, converts original data into log ratios as (Aitchison 1982(Aitchison , 1986Pawlowsky-Glahn and Olea 2004): where the denominator ðZ d ðxÞÞ can be any of the components, yet the same one should be used for all the points and must be strictly positive. This denominator can be, for instance, the fill-up variable because it does not impact the final results of forward and backward transformation (Job 2010) and subsequently results in one less transformed variable than the original number of components. The reason for choosing alr transformation in this study relates to the suitability of spatial covariance structure analysis of a regionalized composition that is needed for the proposed algorithm (CoSS-N) (Pawlowsky and Burger 1992). The backward alr is represented (Aitchison 1982(Aitchison , 1986Pawlowsky-Glahn and Olea 2004) as: 17 Co-simulated Size Number Once the transformed components are obtained, the geostatistical algorithms can be incorporated to the regionalized components FðxÞ rather than original data and the modeled results should be back-transformed to the compositional space afterward . Geostatistical simulation of compositional data in a studied region provides alternative scenarios with equiprobable and representative compositional random function at unsampled locations taking into account the transformed conditioning data (Goovaerts 1997; Chilè s and Delfiner 2012; Talebi et al. 2019). Establishment of the co-kriging systems for co-simulation of transformed compositional data, however, requires modeling the direct and crossspatial continuity (e.g., variogram or covariance). This can be attained from fitting a linear model of co-regionalization carrying direct variograms in the diagonal and cross-variograms off-diagonal (Tolosana-Delgado and van den Boogaart 2013). In order to validate the simulation results, Mueller et al. (2014) illustrate that co-simulation reproduces global statistical parameters of the mean value, variance and distribution of both variables, either in original scale or in log-ratio transformed values, in the case wherever the log-ratio transformed data do not significantly deviate from multi-normality assumption. Furthermore, satisfactory or unsatisfactory reproduction of spatial continuity shows the same characteristics because the conventional cosimulation methods (

Co-simulated Size-Number Fractal Model (CoSS-N)
This proposed idea is based on relating the cosimulated size (CoSS) and number of involved compositional data (N) through integration of the TBCOSIM and N-S fractal analysis with the following equation: where q denotes element concentration, Nð! qÞ is cumulative number of samples with the average of simulated concentration values greater than or equal to q, F is a constant and D is the fractal dimension of the distribution of co-simulated element concentrations. In this model, a large number of possible scenarios can be reproduced using any Gaussian cosimulation algorithm (e.g., TBCOSIM) taking into account the global and spatial cross-correlation structure among the compositional data. The generated realizations not only reproduce the multivariate statistical parameters up to an acceptable level, but also more precisely distinguish different anomalies through fractal analysis by N-S. In the proposed algorithm ( Fig. 1), first, exploratory data analysis must be implemented to reveal the inherent characteristics of compositional data. In this regard, if the dataset is not bounded by a constant D, such as 1, 100 or 10 6 , a fill-up or filler variable, representing the not assayed elements, should 18 Madani and Carranza be considered. Second, the compositional data should be de-clustered prior to log-ratio transformation. Third is the transformation of original compositional random vectors by log ratio to obtain FðxÞ as in Eq. 6. Fourth is the transformation to normal score standard Nð0; 1Þ. Fifth is the (co)spatial continuity analysis in order to infer the linear model of co-regionalization. Sixth, apply the TBCOSIM algorithm over normal score data. Seventh is the back-transformation of realizations from normal score into first, compositional data BðxÞ (Eq. 7) and, second, original scale. Finally, apply the CoSS-N models on simulated results to capture geochemical anomalies.

DATASET
The geochemical dataset used here pertains to 135 stream sediment samples representing a total catchment area of $ 101 km 2 in the Aroroy gold district (Philippines). This is the same dataset used in Carranza (2008Carranza ( , 2010aCarranza ( , b, 2011. The data are concentrations of six elements (As, Cu, Zn, Ni, Co and Mn) determined from the stream sediment samples by atomic absorption spectroscopy after aqua regia decomposition (JICA-MMAJ 1986). Among the uni-element data, the As data contain censored values (i.e., values below the detection limit of 0.5 ppm) associated with 35 of 40 sample catchment basins in the northeastern part of the district underlain mainly by diorite, which is neither genetically nor spatially associated with the epithermal gold deposits in the district (cf. Mitchell and Leach 1991;Carranza 2009), and none of the sample catchment basins associated with censored As values contains any of the known epithermal gold deposits. The censored As values were replaced by one half the detection limit. Then, each step as stated in the ''Co-simulated Size-Number Fractal Model (CoSS-N)'' section was performed on the basis of the proposed algorithm (CoSS-N) for capturing geochemical anomalies over cross-correlated compositional data.

EXPLORATORY DATA ANALYSIS
Each of the variables in the dataset was assayed in a homotopic pattern, implying that each of them was measured from the same sets of sample points and all sample locations are in common (Wacker-nagel 2003). The location map of each variable is shown in Figure 2.
A straightforward tool for examining the spatial characteristic of data as a part of exploratory data analysis is trend analysis (Rossi and Deutsch 2014), which illustrates a cross-plot of the data against spatial coordinates. The interpretation of this plot in conjunction with the location map of data contributes to better understanding of the spatial variability of a variable being examined in a region. Figure 3 shows the variability of all six variables vs. principal coordinates, along west-east and southnorth directions. Consideration of the plots from west to east (Fig. 3a) reveals that As is significantly decreasing while Mn, Co and Zn are mildly decreasing. However, the variability of the elements is different in the south-north direction, along which As is increasing notably while Mn, Zn and Cu are smoothly decreasing (Fig. 3b). The other variables do not show significant variation along these directions (Fig. 4).
Statistical parameters are crucial for describing the global characteristics of geochemical variables and possibly compositional data. If the sampling pattern is irregular, it is of particular importance to de-cluster the dataset in order to compute more representative statistical parameters (Goovaerts 1997). In this technique, each datum receives a weight between 0 and 1 based on spatial closeness to the surrounding data. Histograms and all summary statistics can then be calculated from the weighted data (Rossi and Deutsch 2014). In this study, cell declustering (Deutsch and Journel 1998) as a commonly used de-clustering technique was applied. In this technique, the area should first be divided into a grid of cells. The weights then can be calculated according to the number of data falling in the same cell. Each cell occupied by sample points is assigned the same weight regarding the number of data while an unoccupied cell receives no weight. A dimension of 3000 9 3000 based on average sampling pattern is selected for the cell of interest to obtain the weights of each sample point. This cell size approximately corresponds to the spacing of the data in sparsely sampled regions (Rossi and Deutsch 2014). Table 1 shows the summary statistical parameters before and after de-clustering. For comparing the global statistics of the variables, the coefficient of variation (COV) is also provided. This unit-less measure of variability, which for all, is less than 2.5, indicating that one is not dealing with mixture of very high and very low values (Rossi and Deutsch 2014), and thus,

19
Co-simulated Size Number the separation of geological domains is not required, although As shows the highest COV among the variables. As can be seen from Table 1, the declustered mean for all the variables increased by 20.61% for As, 1.36% for Cu, 5.93% for Zn, 5.76% for Ni, 3.81% for Co and 0.35% for Mn. These significant ranges of variations indicate that the data are mostly clustered in low-valued areas (Rossi and Deutsch 2014).
Because the scope of this study is multivariate geostatistical modeling of variables, multivariate statistics of the variables are discussed. Correlation coefficients as popular measure for this quantify the global dependency that might exist among variables. The cross-correlation coefficient matrix for both clustered and de-clustered is provided in Table 2 (upper diagonal and lower diagonal, respectively). Note that there are strong correlations (i.e., above 50%) among Zn and Ni, Co, Mn, between pairs of Ni-Co and Co-Mn in both cases, indicating a high level of dependency in global relationship. All the variables also are positively correlated. Since the declustering weights are determined on the basis of geometric configuration, only one set of weights is computed for correlation coefficients (Rossi and Deutsch 2014).

LOG-RATIO TRANSFORMATION
The studied variables were transformed to log ratios for further analysis. Prior to this transformation, a fill-up variable was introduced to constrain the sum of components to 10 6 ppm. This variable represents the components in the samples that were not assayed. The descriptive statistical parameters of the compositional data may be uninformative (Aitchison 1986). In particular, statistical correlation of such data may be spurious and unrealistic (Pawlowsky-Glahn 1984;Pawlowsky-Glahn and Olea 2004). One of the reason for this relates to the Aitchison distance between two compositional

20
Madani and Carranza observations, Z a ðxÞ and Z b ðxÞ, which does not follow the typical Euclidean geometry (Aitchison 1986): In this respect, for inferring the central tendency of each component, Aitchison proposed the use of compositional center for a dataset of size n, regardless of the type of log-ratio transformation: Furthermore, Aitchison (1986) defined the dispersion measure of a compositional dataset that can be described by variation matrix T ¼ ½t ij :  21 Co-simulated Size Number This can also be an approximation to measure correlation between two components (Buccianti and Pawlowsky-Glahn 2005), although it cannot be seen as a typical correlation coefficient. The low values of this measure represent that all two side observations show about the same ratio, while high variability reflects substantially different ratios of the two parts among the observations (Filzmoser et al. 2009). Its normalized version exp( À t ij Þ transforms the values to the interval ½0; 1 (Buccianti and Pawlowsky-Glahn 2005). The compositional center (closed geometric mean) and the variation matrix for all six compositional dataset and one fill-up variable are presented in Table 3.

NORMAL SCORE TRANSFORMATION
In order to implement the geostatistical cosimulation algorithm, the alr-transformed variables (i.e., alr.As, alr.Cu, alr.Zn, alr.Ni, alr.Co and alr.Mn) should be transformed to standard Gaussian, with mean and variance of 0 and 1, respectively. The reason for Gaussian transformation relates to the multi-Gaussian distribution assumption that makes the inference of the conditional distribution in geo-statistical algorithms convenient (Goovaerts 1997). However, several variables in the geosciences rarely obey this assumption and rather show asymmetric distributions, positively or negatively skewed (Deutsch and Journel 1998;Van den Boogaart et al. 2017). The transformation can be carried out through Gaussian anamorphosis (Rivoirard 1994) or quantiles-based approach (Deutsch and Journel 1998). In multivariate cases, it is common practice to transform each variable separately to normal standard score and employ one of the widely used functions of Gaussian co-simulation. In this study, the quantiles-based approach was applied to transform the six cross-correlated variables to normal standard score. The normal score transformed variables are depicted in normal probability plots (Fig. 5). The straight-line plots confirm that the distributions of the transformed variables follow the normal standard (ordinate), compared to the standard Gaussian distribution (abscissa).
Univariate Gaussian characteristics of the variables for (co)-simulation are required but not sufficient to ensure that a random function is multivariate Gaussian (Chilè s and Delfiner 2012). This examination can be done by checking the experimental variogram of different order (w) over the transformed Gaussian variables. In this respect, modogram ( w ¼ 1) and rodogram ( w ¼ 1=2) can be defined as (Deutsch and Journel 1998): Modogram: Rodogram: where ½zðu a Þ À zðu a þ hÞ : is an h-increment of the variable z and NðhÞ is the number of pairs.
Once the two functions are computed, in order to show that the spatial bi-normality distribution assumption between transformed pairs is also honored, the experimental modograms and rodograms should be examined whether they are approximately proportional to the square roots of their experimental variograms (Emery 2005). This can be validated by plotting the former functions vs. the latter functions in a log-log coordinate. The application of this theory to the six cross-correlated variables shows that experimental points follow relatively the theoretical lines (Fig. 6), implying that the transformed variables are in satisfactory agreement with the bi-normality distribution assumption, and thus, the Gaussian (co)-simulation can be performed.
Experimental variograms were calculated for the transformed Gaussian variables in order to model their joint cross-correlation structures. This joint structure is required to solve the co-kriging system in co-simulation algorithm. The computation of direct variogram per variable in different azimuth directions (0°, 45°, 90°and 135°) revealed no significant geometric and zonal anisotropies, indicating that omni-directional variograms are sound enough for modeling such a joint spatial structure. In this regard, two spherical nested structures with ranges of 2000 and 8000, considering a nugget effect, were inferred through the linear model of co-regionalization. This fitting procedure was advocated due to its simplicity and versatility (Journel and Huijbregts 1978;Goovaerts 1997;Wackernagel 2003) and can be obtained by semi-automatic fitting algorithm (Emery 2010). For brevity, the outcomes of fitting for direct variograms are displayed in Figure 7, imposing the semi-definiteness condition, which guarantees the validity of the linear model of coregionalizations. The complete formula of variogram analysis including the linear model of co-regionalization fitted sills is given in Eq. (14). The main diagonal of the matrix relates to the fitted sills of direct variograms associated with upper and lower diagonals detailing the fitted sills of cross-variograms.
Once the variograms analysis was achieved, the co-simulation was implemented on a regular 2D grid of 50 m by 50 m. The TBCOSIM algorithm was used with a moving neighborhood in the shape of circle with radius of 5000 m, incorporating up to 200 sample points regardless of considering any quadrant for splitting the neighborhood circles to equal portions. The number of realizations and the turning

24
Madani and Carranza bands were set to 100 and 5000, respectively. Indeed, each of this number of realization mostly corresponds to the efficient characterization of the space of uncertainty (Goovaerts 1999) and should be based on minimum acceptance criteria for geostatistical simulation (Leuangthong et al. 2004). Abildin et al. (2019) showed that 100 realizations are an adequate number in co-simulation of cross-correlated variables for convergence of statistical parameters of the simulation results to the original de-clustered ones. The reason for arranging the latter number of turning bands to 5000 also relates to elimination of stripping effects that should be as large as possible (Emery 2008).
In the process of conditioning, as explained above, simple co-kriging is required to be considered. Multiple searching strategy was also selected, because Madani and Emery (2019) showed that this type of neighborhood is more precise and reliable. This strategy was performed in six parts, in which the neighborhood was searching for 200 closest data of the first, second, third, fourth, fifth and sixth variable in turn, independent of each other. Ordinary co-kriging can likewise be taken into account in the conditioning step, but in such a case the mean values of cross-correlated variables are uncertain (Emery 2007). In order to compare the results of joint simulation based on the methodology proposed (CoSS-N), independent TBSIM was also performed for each variable following Sadeghi et al. (2015) (i.e., SS-N) regardless of cross-correlation among the variables and compositional nature of the dataset. In this workflow, simple co-kriging in TBCOSIM was substituted for simple kriging taking into account the only normal score conditioning data that were obtained from Gaussian transformation of de-clustered original dataset. Then, direct variograms were computed on the normal score dataset and independent simulation (TBSIM) was implemented on each variable with the same number of realizations and turning bands as considered for TBCOSIM. The simulation results were then back-transformed to original scale. In TBCOSIM, however, we needed one extra back-transformation. In this case, the cosimulated realizations that keep the normal score distribution were back-transformed first and rescaled to alr-transformed variables and were then back-transformed to original scale (ppm) by Eq. 5. E-type maps, obtained by geometric averaging the back-transformed variable of interest across 100 realizations of TBCOSIM per block, are depicted in Figure 8. As can be seen, the results are compatible with the trend analysis already provided in the above section. For instance, As is concentrated in the left corner of the region and is decreasing toward the east. Another confirmation of the reproduction of dependency among the variables is related to the strong correlations between the pairs of Co-Mn (0.704), Ni-Co (0.712) and Zn-Mn (0.718) and relatively weaker correlations between the pairs of Zn-Ni (0.519) and Zn-Co (0.656), which can be visualized from the produced maps, implying that the TBCOSIM was capable of qualitatively reproducing the de-clustered correlations among the variables through all the realizations.

VALIDATION
In this section, it is of interest to validate the simulated results obtained from TBCOSIM and compare to the outputs of independent simulation, TBSIM. The scope of this work is mainly on the basis of reproduction of global and local correlation coefficient and its application to enhance characterization of geochemical anomalies. The CoSS-N model is proposed for compositional data as an innovative algorithm for showing the capability of Gaussian co-simulation to fit the latter purpose. In this regard, we decided to make this comparison (i.e., between TBSIM and TBCOSIM) over the conventional statistical parameters for several reasons. First of all, in this paper, the algorithm for TBSIM deems that the original dataset is non-compositional while neglecting the meaningful interrelationship among the variables (i.e., following the algorithm provided by Sadeghi et al. 2015 that is for univariate modeling). Second is the lack of practical guidelines in commercial software from the industrial perspective for dealing with modeling of compositional data (Tolosana-Delgado et al. 2018) and the corresponding statistical analysis, even after providing the monographs (Pawlowsky-Glahn and Olea 2004) despite enhanced development of logratio transformation techniques (Pawlowsky-Glahn 2003;Tolosana-Delgado 2006;Tolosana-Delgado and van den Boogaart 2013). Third, subject to the behavior of log-ratio transformed variables and their deviations from normal distribution, alternative examination for reproduction of histogram and variogram on both original scale and log-ratio transformed variables shows similar results (Mueller et al. 2014). Fourth, alr-transformed variables are no longer constrained by constant sum and common 25 Co-simulated Size Number geostatistical methods can be incorporated to derive the spatial relationships among the variables (Pawlowsky and Burger 1992). Fifth, there may be some sound arguments against using the original correlation and statistical parameters among the primary variables for decision-making that may lead to spurious decisions (Pawlowsky-Glahn and Egozcue 2016 and references therein). This reason is advocated, because the original variables are just a subgroup of a whole family known as compositional dataset, and the original de-clustered correlation still can be considered as an approximation of inherent dependency among the compositional data and may be examined as criteria for comparison between alternative geostatistical simulation methodologies. Sixth, there are several case studies based on geostatistical simulation of log-ratio transformation of original variables and contrasting to other methods that the validation parts are mainly based on conventional statistical parameters (e.g., Rubio et al. 2016;Van den Boogaart et al. 2017;Hosseini and Asghari 2019 and references therein).

Multivariate Statistical Parameters Reproduction
In order to validate the results, it is firstly of interest to check that the realizations reproduce the global dependency relationships among the variables. In this respect, one can compute the correlation coefficients between the back-transformed values of simulated variables in each realization (Eze et al. 2019). It is expected that these coefficients fluctuate around the experimental correlations as provided in Table 2, for instance 0.704 for Co-Mn, 0.712 for Ni-Co, 0.718 for Zn-Ni and 0.656 for Zn-Co. This situation occurs in the case of TBCOSIM, but not in TBSIM (Fig. 9). In the latter algorithm, the produced correlations through 100 realizations are, on average, lower than the expected de-clustered correlations. This bias can be explained because, compared to TBCOSIM, TBSIM does not consider the cross-dependency among variables, leading to poor reproduction of the cross-correlation among the simulated variables. The reproduced correlation coefficients are given in Table 4. Nevertheless, as mentioned earlier, there are some legitimate arguments concerning computation of Pearson correlation coefficients among compositional variables that may be deemed spurious. Therefore, in order to examine the interrelation dependency regardless of measure of correlation coefficient for comparison of two algorithms, the scatter plot between pairs of Zn-Ni, Co-Zn, Mn-Zn, Co-Ni and Mn-Co are depicted in Figure 10. In this plot, one can examine the reproduction of bivariate relation between the variables whether or not it follows the original de-clustered bivariate relation regardless of any correlation coefficient measures. As can be seen, the plots demonstrate that, compared to TBSIM (Fig. 10), not only is the reproduction of bivariate relations in TBCOSIM improved but also the bivariate relations are roughly in agreement with those obtained from correlation coefficient measures (Fig. 9, Table 4).
Another way to examine the quality of the simulation results is to check the reproduction of spatial continuity in the region, implying whether the local cross-correlation is reproduced well. Computation of direct and cross-variograms of the simulated Gaussian random fields is an option for this purpose. In this respect, spatial continuity measures of simulated realizations should converge, on average, to the fitted theoretical model, obtained either from linear modeling of co-regionalization or from any other fitting procedures (Emery 2004). If this variogram curve, on average, is not in agreement with the theoretical model, it indicates that the simulation algorithm was not appropriate for the studied dataset and leads to produce biased results. In this study, the results of spatial continuity analysis show that although the direct variograms are reproduced well by both TBCOSIM and TBSIM algorithms, the cross-variograms reproduction were not satisfactory for the TBSIM outputs (Fig. 11). The reason why the TBSIM results were biased is that the cross-dependency among the variables was not considered in this algorithm and one should not expect to reproduce the desired cross-correlation among the variables as provided in Table 5 for the lag separation 0.

Univariate Statistical Parameters Reproduction
It is instructive to examine the reproduction of the means and variances of the simulated realizations by TBCOSIM so as to compare with the original de-clustered means and variances of the b Figure 6. Experimental modograms and rodograms of the Gaussian variables as a function of their experimental variograms. In case of bi-normality, the data points should be approximately distributed along the thick solid line.
27 Co-simulated Size Number studied variables. While the mean is the acceptable measure for comparison of realizations, and even if the same central value is obtained as the declustered mean, the simulation results may show totally different distributions. Thus, the variance as additional measure of variation is required for better interpretation of the simulated data dispersion. Consequently, the mean and variance were calculated per set of realizations. The means and variances of 100 conditional simulated realizations are presented in Table 4. The reproduced means are slightly lower, varying from 0.60 to 11.33% depending on the variable, compared to the declustered means, and thus are quite satisfactory for all the six variables. Likewise, the reproduced variances are slightly lower, varying from 5.66 to 25.5% depending on the variable, compared to the declustered variances, and thus are quite satisfactory. The slightly lower means and variances can be interpreted as due to the influence of conditioning data (sample values). The simulation results were also checked thoroughly per variable, before backtransformation, to examine whether the realizations follow a normal standard distribution.

Validation Against Actual Data
The simulated results can also be examined against the actual values of the studied variables. For instance, more samples may be available from some locations as actual data in further steps of mineral exploration that those areas have already been simulated beforehand and some values are available as prediction data. However, in this study, the true values are not available. Therefore, in order to test the proficiency of the co-simulation results, a typical validation technique, which is suitable for validation of geostatistical algorithms, is taken into account for the six studied variables. This test, the so-called crossvalidation technique, involves removing one sample at a time and is re-estimated from the remaining neighborhood conditioning data. Each datum is then replaced in the data set once it has been re-estimated (Deutsch and Journel 1998). This type of analysis is basically applicable for evaluation of (a) variogram models, (b) type of kriging and (c) the search strategy in geostatistical contexts. Nevertheless, the crossvalidation technique is also be applicable for evaluation of (co)-simulation results. To do so, TBCOSIM was applied following the procedures that were thoroughly discussed in the previous sections. Because the ''true'' values of each variable were measured at sample locations, the averages of simulation results at the same locations conditioned to the remaining data provide the prediction values per variable at the corresponding locations. Therefore, a scatter plot can be drawn though the predicted and measured values at sample points. The results are depicted in Figure 12, and the bivariate distributions of plots and their tendencies to the diagonal line (black line) reveal that TBCOSIM provides satisfactory outputs without loss of generality. This also corroborates that this methodology (TBCOSIM) is precise enough in the case of cross-correlated variables whenever the actual data are available.

Co-simulated Size Number (CoSS-N)
As already discussed, the results of Gaussian co-simulation turn out more reliable outputs due to taking into account the cross-correlation structure among the variables of interest. This concept cannot be corroborated, however, in the case of dealing with independent simulation algorithm for modeling those types of variables with substantial inter-dependency relationship. In this section, it is of interest to capture different thresholds through fractal analysis of TBCOSIM simulated realizations (100 outputs) over the underlying N-S log-log plots. To do so, separate curves were first obtained from each individual realization, and then, the average of all log-log curves was calculated for each variable. Next, the segment lines have been fitted through this average curve graph of realizations and the thresholds were inferred as shown in Figure 11. The corresponding thresholds derived from fractal analysis in Figure 13 are also presented in Table 6.

PROBABILISTIC DESCRIPTION OF ANOMALOUS AREA
Once the thresholds are derived from fractal analysis on the corresponding realizations obtained from TBCOSIM, one application pertains to the production of maps showing anomalous areas for b Figure 7. Sample (crosses) and modeled (solid lines) direct variograms of Gaussian variables. For brevity, only the direct variograms are displayed, although the fitting has been carried out with all the direct and cross-variograms.
29 Co-simulated Size Number

30
Madani and Carranza further mineral exploration. This step is crucial, because it makes the investigation more up-scaled and narrower. Carranza (2010a, b) applied principal component analysis (PCA) to explain as much information contained in the data as possible in a few components as possible for Aroroy dataset. PCA searches for the direction in the multivariate space that contains the maximum variability. The two first perpendicular principle components (PCA1) and (PCA2) convey the maximum amount of the remaining data variability (Reimann et al. 2008). PCA can also be seen as a decomposition of the covariance matrix or correlation matrix into its ''eigenvectors'' and ''eigenvalues'' that are the loadings of the principle components spanning the new PCA coordinate system. In his study, Carranza (2010a, b) implemented the PCA analysis on the same case study dataset, indicating that PCA1 accounts for 34% of the total variance and represents a Co-Zn-Mn association, for which it shows possible metal scavenging by Mn oxides in the drainage environments in most part of area, while the PCA2 accounting for 22% of total variance introduces a Ni-Cu-As association with possibly enrichment of epithermal Au deposit in the area due to the weathering and erosion of anomalous sources containing sulfide minerals such as chalcopyrite and arsenopyrite. This association also reflects dacitic/ andesitic rocks that host the epithermal Au deposits in the area. In this study, we integrated the two mentioned associations (Co-Zn-Mn and Ni-Cu-As) to produce the probability maps, offering an alternative tool to show the uncertainty of finding probable anomalous areas in the region. In the previous section, the individual maps for each variable obtained through averaging the realizations were presented (Fig. 8). These maps can be integrated to produce the probability maps in the region, based on the corresponding thresholds. For instance, it is of interest to define the probable areas in one unique map instead of three different maps that jointly illustrate the anomalous areas for association of Ni-Cu-As taking into account the thresholds that identify the anomalous area for each individual variable. Based on Table 6, the initiation thresholds for Ni, Cu and As are 19.95 ppm, 39.8 ppm and 2.51 ppm, respectively. Therefore, the idea is to find the probability of common areas based on these variables and, thus, to prepare one map summarizing two or three variables. This is of paramount importance for delineation of areas with possible locations of epithermal Au deposits described by these Ni-Cu-As and Co-Zn-Mn associations. To do so, this concept can be formulated based on probability rules (e.g., Hogg et al. 2012;Olea et al. 2016): For two variables, A and B: For three variables, A, B and C: Therefore: For likely epithermal Au deposit locations defined by Cu and As variables: Prob Cu ! 39:8 \ As ! 2:51 ð Þ ¼ Prob Cu ! 39:8 \ As ! 2:51 ð Þ Â Prob As ! 2:51 ð Þ ð 17Þ b Figure 9. Correlation coefficients between transformed simulated variables with turning bands co-simulation (left) and TBSIM (right). For brevity, only the correlation between the pairs of Zn-Ni, Co-Zn, Mn-Zn, Co-Ni and Mn-Co is illustrated. The average correlation over the realizations is represented with a red line, while the true de-clustered correlation is represented with a green line.    Figure 13. Results of fractal method and the thresholds identified for the CoSS-N model.

36
Madani and Carranza The probability maps are shown in Figure 14. They are constructed based on Eqs. 15,16,17,18 and 19 for each block through 100 conditional TBCOSIM realizations and show the chance of finding plausible mineralized zones based on the identified thresholds. The sectors with little uncertainty are those associated with a high probability for a given anomaly (portrayed in red in Fig. 13), indicating that there is little risk of not finding an anomaly, or those associated with a very low probability (portrayed in dark blue in Fig. 13), indicating that one is pretty sure of not finding an anomaly, while the other sectors (portrayed in light blue, green, or yellow in Fig. 13) are more uncertain. The footprint of plausible anomalies for enrichment of Cu and As elements (Fig. 14a) shows that the likely locations where epithermal Au deposits can be found trend NW-SE, which is in agreement with the trend of their host rocks (Fig. 14b).

CONCLUSION
Identification of geochemical anomalies is important for mineral exploration. However, deriving the corresponding elemental thresholds from fractal/multi-fractal analysis is challenging whenever the geochemical variables exhibit dependency among each other and they have the inherent compositional characteristics. In this study, an innovative workflow is proposed for tackling these issues based on a combination of geostatistical Gaussian co-simulation and fractal analysis, namely co-simulated size number (CoSS-N). In this context, the geochemical components should be transformed from original scale to new variables by one of the log-ratio transformation techniques, such as additive log-ratio transformation (alr), after inference of statistical parameters in both original and log-ratio scales, respectively. In order to quantify spatial continuity, direct and cross-variogram can be calculated over alr-transformed variables and linear model of coregionalization should be fitted respecting semidefinite positive condition. Once the variogram structure is derived, one of the Gaussian co-simulation algorithms, such as turning bands co-simulation (TBCOSIM), can be implemented. The log-ratio simulated results should then be back-transformed to original scale and fractal analysis can then be applied for calculation of thresholds for differentiating of background from anomaly. The proposed approach was performed in a real case study from the Philippines where seven cross-correlated variables were measured at 135 sample points from the surficial environment. In order to validate the proposed algorithm, CoSS-N results were compared with the results of SS-N (Sadeghi et al. 2015), whereby global and local statistical parameters were examined on the simulation results obtained from both approaches. The overall accuracy of the CoSS-N corroborates that it outperforms the SS-N model in the case of dealing with cross-correlated geochemical components. The derived thresholds were then applied to calculate the uncertainty and plausible areas where to find epithermal Au deposits in the region. The probabilistic produced maps showed a NW-SE trend of locations where possible epithermal Au deposit can be found in the region. These results have not been recognized from the geometric average of back-transformed simulation results. In future, this study can be extended based on identification of thresholds by fractal analysis of cross-correlated variables, wherever the underlying