Astronomy & Astrophysics, A&A, 689, A282 (2024)
https://doi.org/10.1051/0004-6361/202449791
© The Authors 2024

Cluster membership analysis with supervised learning and N-body simulations

A. Bissekenov1,2, M. Kalambay3,4,2,5,6, E. Abdikamalov7,2, X. Pang1,8, P. Berczik9,10,11,5, and B. Shukirgaliyev3,4,7,⋆

1 Department of Physics, Xi’an Jiaotong-Liverpool University, 111 Ren’ai Road, Dushu Lake Science and Education Innovation District, Suzhou 215123, Jiangsu Province, PR China
2 Energetic Cosmos Laboratory, Nazarbayev University, 53 Kabanbay Batyr Ave., 010000 Astana, Kazakhstan
3 Heriot-Watt University Aktobe Campus, 263 Zhubanov Brothers Str., 030000 Aktobe, Kazakhstan
4 Heriot-Watt International Faculty, K. Zhubanov Aktobe Regional University, 263 Zhubanov Brothers Str., 030000 Aktobe, Kazakhstan
5 Fesenkov Astrophysical Institute, 23 Observatory Str., 050020 Almaty, Kazakhstan
6 Faculty of Physics and Technology, Al-Farabi Kazakh National University, 71 Al-Farabi Ave., 050020 Almaty, Kazakhstan
7 Department of Physics, School of Sciences and Humanities, Nazarbayev University, 53 Kabanbay Batyr Ave., 010000 Astana, Kazakhstan
8 Shanghai Key Laboratory for Astrophysics, Shanghai Normal University, 100 Guilin Road, Shanghai 200234, PR China
9 Nicolaus Copernicus Astronomical Centre, Polish Academy of Sciences, ul. Bartycka 18, 00-716 Warsaw, Poland
10 Konkoly Observatory, HUN-REN Research Centre for Astronomy and Earth Sciences, Konkoly Thege Miklós út 15–17, 1121 Budapest, Hungary
11 Main Astronomical Observatory, National Academy of Sciences of Ukraine, 27 Akademika Zabolotnoho St., 03143 Kyiv, Ukraine

Received 29 February 2024 / Accepted 27 July 2024

ABSTRACT

Context. Membership analysis is an important tool for studying star clusters. There are various approaches to membership determination, including supervised and unsupervised machine-learning (ML) methods.
Aims. We perform membership analysis using the supervised ML approach.
Methods.
We trained and tested our ML models on two sets of star cluster data: snapshots from N-body simulations, and 21 different clusters from the Gaia Data Release 3 data.
Results. We explored five different ML models: random forest (RF), decision trees, support vector machines, feed-forward neural networks, and K-nearest neighbors. We find that all models produce similar results, with the accuracy of RF being slightly better. We find that a balance of classes in the datasets is optional for successful learning. The classification accuracy strongly depends on the astrometric parameters. The addition of photometric parameters does not improve the performance. We find no strong correlation between the classification accuracy and the cluster age, mass, and half-mass radius. At the same time, models trained on clusters with a larger number of members generally produce better results.

Key words. methods: data analysis – methods: numerical – Galaxy: kinematics and dynamics – open clusters and associations: general – solar neighborhood

1. Introduction

The research of open clusters (OCs) is essential in many areas of astronomy, including the formation and evolution of stars, Galactic dynamics, and star formation history (Lada & Lada 2003; de la Fuente Marcos & de la Fuente Marcos 2004; Krumholz et al. 2019). The reason is that the stars in a cluster tend to have the same age and kinematics (Portegies Zwart et al. 2010; Renaud 2018; Krumholz et al. 2019). To better estimate the fundamental parameters of star clusters, such as age, mass, size, and metallicity, it is crucial to separate stars that are members from those that are in the field of the Galaxy (Kharchenko et al. 2005, 2012). This process is usually referred to as membership analysis.

⋆ Corresponding author; b.shukirgaliyev@hw.ac.uk

Identifying member stars within OCs presents a significant challenge due to their location within the Galactic disk (Ascenso et al. 2009; Kharchenko et al. 2012).
The subtle overdensity generated by these clusters often becomes obscured by field stars (Bland-Hawthorn & Gerhard 2016). Historically, membership analysis relied heavily on manual methods (e.g. Stock 1956; Ruprecht et al. 1981; Phelps & Janes 1994; Chen et al. 2003; Kharchenko et al. 2005, 2012; Dias et al. 2014; Conrad et al. 2017; Röser et al. 2019; Meingast & Alves 2019; Lodieu et al. 2019). However, recent advancements in machine learning (ML) and the large datasets provided by the Gaia Data Releases (Gaia Collaboration 2016a,b, 2018, 2021, 2023) have revolutionized this process (Olivares et al. 2023). Generally, supervised and unsupervised ML methods can be used (Sindhu Meena & Suriya 2020). While the supervised approach is based on training on labeled data, the unsupervised method does not rely on labeled data (Sindhu Meena & Suriya 2020). Numerous studies now leverage unsupervised learning techniques to automate and enhance the determination of cluster membership (Gao 2014; Cantat-Gaudin et al. 2018; Gao 2018b; Tang et al.
2019; Liu & Pang 2019; Castro-Ginard et al. 2020; Agarwal et al. 2021; Noormohammadi et al. 2023; Hunt & Reffert 2023, and more). There are several types of unsupervised learning methods (Bishop 2006), including clustering, self-organized map algorithms, and probabilistic models (Bação et al. 2005). Among the clustering methods, density-based algorithms have gained wide use (Gao 2014; Tucio et al. 2023; Ghosh et al. 2022). There are two versions of this method: density-based spatial clustering of applications with noise (DBSCAN, Ester et al. 1996), and hierarchical density-based spatial clustering of applications with noise (HDBSCAN, Campello et al. 2013). DBSCAN works by assigning, as hyperparameters, a search radius and a minimum number of stars (Hunt & Reffert 2021). When the number of actual stars exceeds these values, the region is classified as a cluster (Gao 2014). HDBSCAN works similarly but extends DBSCAN by constructing a cluster hierarchy based on minimum spanning trees (Hunt & Reffert 2021). This allows the handling of clusters of varying shapes and densities (Campello et al. 2013). Gao (2014) used DBSCAN on the 3D kinematic features of the NGC 188 cluster and identified 1504 member star candidates. Tucio et al. (2023) and Ghosh et al. (2022) used HDBSCAN to separate the member stars of NGC 2682 and NGC 7789 from field stars. The code UPMASK combines principal component analysis and clustering algorithms, such as the k-means clustering implemented in the R language (Krone-Martins & Moitinho 2015). pyUPMASK is a Python implementation of the same method (Pera et al. 2021). StarGO (Yuan et al. 2018) is an unsupervised self-organized map algorithm that was initially built to find halo stars in the Galaxy. Tang et al. (2019) used it for dimensionality reduction and membership determination using 5D kinematic data of stars. Later, this technique was used to find new OCs in the solar neighborhood (Liu & Pang 2019; Pang et al.
2020, 2022). Additionally, a probabilistic model such as the Gaussian mixture model (GMM) was used by Jaehnig et al. (2021), who applied it to 426 OCs.

Several attempts have also been made to use supervised learning in membership analysis. A combination of unsupervised learning with supervised model methods was employed, such as random forest (RF) with GMM (Gao 2018a, 2019a,b; Mahmudunnobe et al. 2021; Jadhav et al. 2021; Das et al. 2023; Guido et al. 2023) and K-nearest neighbors (KNN) with GMM (Agarwal et al. 2021; Deb et al. 2022). As for the supervised learning-only approach, van Groeningen et al. (2023) used deep sets neural networks on 167 OCs from the Gaia DR2 and eDR3 data based on the catalog of Cantat-Gaudin et al. (2020).

However, the membership analyses performed using different methods in previous studies show only a limited agreement with each other. For example, Bouma et al. (2021) reported that for OC NGC 2516, only 25% of the labels by Kounkel & Covey (2019), 41% of the labels by Meingast et al. (2021), and 68% of the labels by Cantat-Gaudin et al. (2018) overlap with each other.

In this work, we perform a membership analysis using supervised learning on data from simulations and observations for training and testing. For the former, we use the N-body simulations from Shukirgaliyev et al. (2021). The advantage of using simulation data is that we can determine the memberships of stars based on the physical conditions in the simulation. We use the Gaia DR3 data as the observational data and label the cluster members based on the findings of Liu & Pang (2019) and Pang et al. (2022).

Table 1. Sets of feature combinations.

Set  Features
1    α, δ, µα, µδ, π
2    α, δ, m, GBP − GRP
3    α, δ, µα, µδ, m, GBP − GRP
4    α, δ, µα, µδ, π, υr
5    α, δ, µα, µδ, π, m, GBP − GRP

This paper is organized as follows. Section 2 describes the method. Section 3 presents our results, and Sect. 4 provides our conclusions.

2. Method

2.1.
Data

We trained our ML models using data from N-body simulations and Gaia DR3 observations. When we used the simulation data, we considered eight observable parameters to describe the stars. These are the five astrometric parameters: the right ascension α, the declination δ, the two proper motions along these directions µα and µδ, and the parallax π; the two photometric parameters: the apparent magnitude mG and the color index GBP − GRP in the Gaia bands (Maíz Apellániz & Weiler 2018); and the radial velocity υr. To study the impact of these parameters, we explored different combinations of them in our training and testing. For our default combination, which we call combination 1, we used the five astrometric parameters (α, δ, µα, µδ, and π). For parameter combination 2, we used α, δ, m, and GBP − GRP. For combination 3, we used α, δ, µα, µδ, m, and GBP − GRP; for combination 4, α, δ, µα, µδ, π, and υr; and for combination 5, α, δ, µα, µδ, π, m, and GBP − GRP. Not all of these parameters are available in the Gaia catalog. For this reason, when we tested and trained using the Gaia data, we only used the default combination (combination 1; see Table 1 for a summary).

2.1.1. N-body simulations

Our study is partially based on applications of supervised learning on N-body simulation data that cover the dynamical evolution from the gas expulsion to complete dissolution (Shukirgaliyev et al. 2017, 2021). The initial conditions are based on the Parmentier & Pfalzner (2013) model. We use the N-body simulation models of Shukirgaliyev et al. (2021) with a Plummer density profile at the time of instantaneous gas expulsion. These are clusters of 10⁴ stars that formed with three different global star formation efficiencies (SFEs) of 17%, 20%, and 25%. Shukirgaliyev et al. (2021) performed nine simulations per model with different randomizations. We only use the two randomizations of the positions and masses labeled 11 and 22 by Shukirgaliyev et al. (2021).
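The feature combinations of Table 1 amount to column selections on a star catalog. A minimal sketch (the column names, their order, and the function are ours, purely for illustration):

```python
import numpy as np

# Hypothetical column order for the eight observables described above.
COLUMNS = ["ra", "dec", "pmra", "pmdec", "parallax", "m_g", "bp_rp", "rv"]

# The five feature sets of Table 1, expressed as column subsets.
FEATURE_SETS = {
    1: ["ra", "dec", "pmra", "pmdec", "parallax"],
    2: ["ra", "dec", "m_g", "bp_rp"],
    3: ["ra", "dec", "pmra", "pmdec", "m_g", "bp_rp"],
    4: ["ra", "dec", "pmra", "pmdec", "parallax", "rv"],
    5: ["ra", "dec", "pmra", "pmdec", "parallax", "m_g", "bp_rp"],
}

def select_features(catalog, set_id):
    """Return the (n_stars x n_features) matrix for one feature combination."""
    idx = [COLUMNS.index(name) for name in FEATURE_SETS[set_id]]
    return catalog[:, idx]
```

Keeping the combinations as named column subsets makes it straightforward to train the same model on each of the five sets in turn.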
We did not use the model clusters with the lowest SFE of 15% because they dissolve quickly: their star count in the aftermath of gas expulsion is low.

The N-body simulations use Cartesian coordinates with an origin at the Galactic center (Shukirgaliyev et al. 2019). We then place the cluster at a distance of 150 pc from the Sun, following Kalambay et al. (2022). We transform the data to equatorial (International Celestial Reference System, ICRS) coordinates using the Astropy package (Astropy Collaboration 2013) to derive the five astrometric parameters α, δ, µα, µδ, and π. Using the effective temperature, mass, metallicity, and luminosity of each star available in the N-body simulation snapshots, we calculate the absolute magnitudes corresponding to the Gaia DR2 bands (G, GBP, GRP; Maíz Apellániz & Weiler 2018) using the method described in Chen et al. (2019). Finally, with the knowledge of the parallaxes of all stars and their G, we calculate their apparent magnitudes (mG) using

mG = G − 5 (log10 π + 1),   (1)

where π is expressed in arcseconds. We excluded dim stars with mG > 21 mag, the limiting magnitude of the Gaia DR2 catalog, from our analysis (Gaia Collaboration 2018).

To make our simulation data resemble actual astronomical observations, we generated field stars of the Galactic plane with the code Galaxia, which produces a synthetic galaxy model (Sharma et al. 2011). We used the version modified by Rybizki (2019) to fit Gaia DR2. We then placed our cluster inside these stars. We labeled stars within the Jacobi radius (Just et al. 2009) as member stars, and those outside this radius were labeled as nonmembers. These nonmembers were member stars in earlier N-body simulation snapshots. The field stars obtained from Galaxia were also labeled nonmembers, although some may occur within the Jacobi radius of the cluster.

We considered a circular field of view around the cluster center that covered about 600 square degrees.
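Equation (1) is the standard conversion from absolute to apparent magnitude via the parallax, which must be expressed in arcseconds for the formula to hold. A minimal sketch (the function name is ours):

```python
import numpy as np

def apparent_magnitude(g_abs, parallax_arcsec):
    """Eq. (1): m_G = G - 5 (log10(pi) + 1), with the parallax pi in arcsec."""
    return g_abs - 5.0 * (np.log10(parallax_arcsec) + 1.0)
```

At π = 0.1 arcsec (10 pc) the distance modulus vanishes and mG = G, while at π = 0.01 arcsec (100 pc) the result is mG = G + 5.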
This region included both the gravitationally bound and the unbound stars of the simulated cluster. We removed the tidal-tail stars located outside this region. Nevertheless, some of the tail stars may appear in front of or behind the cluster when projected onto the celestial sphere. Before applying ML, we moved the origin of our equatorial coordinate system to the cluster center. We determined the cluster center from the observational coordinates and proper motions using an iterative search for the density center. In this algorithm, the average coordinates (α and δ in our case) of all stars in our set were determined, and we shifted the coordinate origin to this location. We then repeated this process, each time considering only the stars within 80% of the maximum radius of the previous set, until we reached the minimum threshold of 150 stars.

2.1.2. Gaia data

In addition to using data from N-body simulations, we also performed training and testing on clusters from the Gaia DR3 data (Gaia Collaboration 2023). From now on, we refer to Gaia DR3 as the Gaia data. We trained three supervised ML models on Blanco 1, the Pleiades, and NGC 2516. We selected 21 clusters for testing: Blanco 1, Collinder 69, Huluwa 3, IC 4756, LP 2373 gp2, LP 2373 gp4, LP 2383, LP 2442, Mamajek 4, NGC 1980, NGC 2422, NGC 2451B, NGC 2516, NGC 3532, NGC 6405, NGC 6475, NGC 6633, the Pleiades, Praesepe, Stephenson 1, and UBC 31. We selected these clusters from the Pang et al. (2022) catalog. These clusters are classified into five morphological categories: filaments, fractals, halo, tidal tail, and unspecified (i.e., without clear morphological features outside one tidal radius). The 21 clusters that we selected contain clusters from each of these categories (four filamentary clusters, four fractal clusters, three halo clusters, five tidal-tail clusters, and five unspecified clusters).
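The iterative density-center search of Sect. 2.1.1 (re-center on the mean position, shrink the selection to 80% of the maximum radius, repeat down to 150 stars) can be sketched as follows; the function and its exact stopping conditions are our illustrative assumptions, not the authors' code:

```python
import numpy as np

def density_center(alpha, delta, shrink=0.8, min_stars=150):
    """Iteratively estimate the density center of a star field.

    Each iteration re-centers on the mean (alpha, delta) of the current
    selection and keeps only the stars within `shrink` times the maximum
    radius, stopping before fewer than `min_stars` stars would remain.
    """
    pts = np.column_stack([alpha, delta]).astype(float)
    sel = np.ones(len(pts), dtype=bool)
    center = pts.mean(axis=0)
    while sel.sum() > min_stars:
        center = pts[sel].mean(axis=0)
        radius = np.linalg.norm(pts - center, axis=1)
        keep = radius <= shrink * radius[sel].max()
        # stop when the next shrink would drop below the threshold or stalls
        if keep.sum() < min_stars or keep.sum() >= sel.sum():
            break
        sel = keep
    return center
```

Because the mean of all stars is pulled toward the dense core at every shrink step, the estimate converges to the cluster center even when the initial field of view is dominated by field stars.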
The data were retrieved through Astropy queries with a full inclusion of the field stars. The membership labels were taken from Pang et al. (2022). Since the member stars from Pang et al. (2022) are incomplete for mG > 21, we again excluded the dim stars with mG > 21 in the Gaia DR3 data from our analysis. This magnitude cut removes stars below 0.3 solar masses (Pang et al. 2024), but it does not affect the scientific motivation of this work.

2.2. Machine-learning

We tested five ML algorithms: KNN (Cover & Hart 1967), decision trees (DT, Breiman et al. 1984), RF (Breiman 2001), feed-forward neural networks (FFNN, Bebis & Georgiopoulos 1994), and support vector machines (SVM, Cortes & Vapnik 1995).

The KNN is a simple, effective, and widely used ML model. It compares the distances of the K nearest neighbors to a given sample for making predictions (Peterson 2009). The algorithm performance depends on K. Previously, the algorithm was used on the Gaia data in combination with GMMs for membership identification (Agarwal et al. 2021). We used K = 5. The distance between neighbors is measured in terms of the Minkowski distance. The probability of cluster membership was calculated based on the distance weight function.

The supervised ML algorithm DT recursively splits the data based on features to create a tree-like structure (Fürnkranz 2010). We used the DT classifier from Scikit-learn (Pedregosa et al. 2011) with default parameters. To the best of our knowledge, this model has not yet been used in studies of OCs.

The ensemble-learning ML algorithm RF uses multiple decision-tree predictors for classification and regression problems (Breiman 2001). This method was used in several studies that worked with Gaia data (Gao 2018a,b, 2019a; Mahmudunnobe et al. 2021; Das et al. 2023). We configured the model with 100 trees and the Gini tree-split criterion, and we did not limit the depth of the trees. We used the RF classifier from the Scikit-learn library (Pedregosa et al. 2011).
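For illustration, the distance-weighted KNN membership probability described above can be written in a few lines of NumPy (a sketch of the idea, not the Scikit-learn implementation actually used; p = 2 gives the Euclidean special case of the Minkowski distance):

```python
import numpy as np

def knn_member_probability(X_train, y_train, x, k=5, p=2):
    """Distance-weighted KNN probability that star x is a cluster member.

    y_train holds 1 for members and 0 for nonmembers.
    """
    # Minkowski distance of order p from x to every training star
    dist = np.sum(np.abs(X_train - x) ** p, axis=1) ** (1.0 / p)
    nearest = np.argsort(dist)[:k]
    weights = 1.0 / np.maximum(dist[nearest], 1e-12)  # inverse-distance weights
    return float(np.sum(weights * y_train[nearest]) / np.sum(weights))
```

A query point sitting inside the member overdensity receives a probability close to 1, while one in the field receives a probability close to 0.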
The FFNN is a supervised ML model and is considered part of deep learning because the model depth depends on the number of layers (Bebis & Georgiopoulos 1994). The selection of appropriate hyperparameters, such as the activation functions, optimization functions, learning rate, loss criterion, batch size, number of epochs, and number of layers, is crucial for successful learning. We used five-layer neural networks with a batch size of 400, with activation functions such as LeakyReLU, Sigmoid, and GELU, trained for 20 epochs with the Adam optimizer and the cross-entropy loss criterion. We used the PyTorch library (Paszke et al. 2019) to implement the FFNN.

The SVM is a supervised ML algorithm that finds the optimal hyperplane that best separates the different classes in the input feature space (Cortes & Vapnik 1995). To our knowledge, it was not used in previous studies on OC membership. We configured the SVM with default hyperparameters from the Scikit-learn library (Pedregosa et al. 2011).

For the evaluation criteria, we used the confusion matrix, which shows the numbers of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN) of a binary classification, as shown in Table 2. The diagonal of the matrix contains the correctly classified samples, that is, the TN and TP (Kohl 2012). We measured the accuracy in terms of the F1 score (Goutte & Gaussier 2005),

F1 = 2 × (precision × recall) / (precision + recall),   (2)

where

precision = TP / (TP + FP),   recall = TP / (TP + FN).   (3)

Table 2. Confusion matrix representing the performance of the prediction compared to the actual value.

                          Prediction
                    Non-member    Member
Actual  Non-member      TN          FP
value   Member          FN          TP

Fig. 1. Stars from the N-body simulation with an SFE of 17% in a 20 Myr snapshot, with generated field stars, projected onto the sky. The blue, orange, and gray points show the member, nonmember, and Galactic field stars, respectively.
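A minimal sketch of Eqs. (2) and (3) from the confusion-matrix counts (the function name is ours):

```python
def f1_from_counts(tp, fp, fn):
    """F1 score from Eqs. (2) and (3); note that the TN count does not enter."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2.0 * precision * recall / (precision + recall)
```

Because the TN count does not enter, the F1 score is not dominated by the many correctly rejected field stars, which makes it a convenient metric for strongly imbalanced member/nonmember samples.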
We trained our ML models on data from snapshots of N-body simulations and on the Gaia DR3 data. For the N-body simulation data, we considered snapshots with different SFEs at different times. For the default case, we used the snapshot at the end of the violent relaxation, at 20 Myr, of the model cluster with an SFE of 17%. We also considered snapshots at 100 Myr and 500 Myr, as discussed below. The performance was not improved when a few snapshots were combined.

For the testing, we used snapshots from simulations with different SFEs. These simulations used a randomization of positions and masses different from that of the training set. When we added field stars, we used snapshots at 100 Myr and 1 Gyr. An example testing set of the model cluster with an SFE of 17% in a 20 Myr snapshot, combined with field stars, is shown in Fig. 1.

3. Results

All the results shown below were obtained using the RF method. The dependence on the ML methods is described in Appendix A.

3.1. Test on N-body simulation data

We first applied ML to the N-body simulation data. For our first training, we selected a snapshot at 20 Myr from a simulation with an SFE of 17%. This snapshot corresponds to the system equilibrium phase after violent relaxation.

The upper panel of Fig. 2 shows the F1 score as a function of time for different testing sets.

Fig. 2. F1 score of ML models tested on N-body simulation data with different SFEs as a function of time (top panel). The half-mass radius (solid lines) and Jacobi mass (dashed lines) are shown in the bottom panel.

We first discuss the case with an SFE of 17%, shown in black. In the initial phase (t ≲ 200 Myr), the accuracy is ≈95%. It starts to decrease with time at t ≈ 200 Myr and reaches ≈77.5% at t ≈ 1 Gyr.
The drop in accuracy is correlated with the size and mass of the cluster, which shrink over time, as shown in the bottom panel of Fig. 2. Most of the mistakes in the later snapshots are due to an increase in FP classifications.

The classification accuracy for simulations with a higher SFE is qualitatively similar to that for an SFE of 17%, but shows some modest quantitative differences. The red and blue lines show the F1 score as a function of time for tests with an SFE of 20% and an SFE of 25%. At t ≲ 200 Myr, the accuracies are ≈98.0% and ≈97.5% for an SFE of 20% and an SFE of 25%, respectively. The F1 scores start to decrease with time at t ≈ 1 Gyr and t ≈ 2 Gyr for an SFE of 20% and an SFE of 25%, respectively. The lowest value for an SFE of 20% is F1 ≈ 68% at t ≈ 1.557 Gyr, and for an SFE of 25%, it is F1 ≈ 83% at t ≈ 3 Gyr.

Next, we explored the dependence on the time and SFE of the N-body simulation snapshots used for training. We trained nine models on snapshots of three model clusters with different SFEs (17, 20, and 25%) at three times (20, 100, and 500 Myr). We tested each model on 1096 snapshots with different times and SFEs, representing the full set of snapshots of all the N-body simulations we used. Figure 3 shows the F1 score as box plots for different values of t and SFE. The center of the box corresponds to the median F1 score, while the upper and lower boundaries correspond to the 75% and 25% quantiles. The error bars represent the

Fig. 3. Box plots of the F1 score as a function of SFE for different snapshots. The x-axis shows the SFE value of the training set. The color represents the snapshot time of the training sets. Each box plot shows the median, quantiles, minimum, and maximum values of the classification results on more than 1096 synthetic cluster datasets.

Table 3.
Classification results of the RF model with the Galactic field stars.

SFE   Time    F1     TP     TN      FP    FN
(%)   (Myr)   (%)    #      #       #     #
17    100     96.3   2046   11 114  147   87   74
17    1000    76.4   99     11 419  272   63   1
20    100     96.6   3311   11 024  986   23   214
20    1000    93.8   650    10 976  115   89   0
25    100     92.5   5691   10 660  405   4    920
25    1000    98.4   3321   10 822  245   13   98

maximum and minimum values. Overall, the differences in accuracies between the different training sets are minor. The average accuracy varies between F1 ≈ 95% and F1 ≈ 97%. The upper quantiles are between ≈97% and ≈99%, and the maximum values are ≈99%. The lower quantiles are between ≈89.0% and ≈93.5%, and the minimum values are between 63% and 73%. We also trained our models on combinations of up to ten snapshots, but the resulting accuracy was not higher than that of the models trained on only one snapshot.

Next, we explored the impact of the stellar parameters on the classification accuracy. As described above, we used five different parameter combinations (see Table 1). Figure 4 shows the F1 score for these five combinations. Overall, the accuracy is not very sensitive to the combination of parameters used for training. The five-parameter combination (α, δ, µα, µδ, and π) exhibits the highest median accuracy of ≈96.5%. The lower and upper quantiles are ≈92.4% and ≈98.7%. The four-parameter combination (α, δ, m, and GBP − GRP) exhibits the lowest median accuracy of ≈88.8%. The lower and upper quantiles are ≈82.0% and ≈92.4%. The results of the remaining combinations are similar to the result of the five-parameter combination (α, δ, µα, µδ, and π) in medians, quantiles, and minimum and maximum values.

We finally explored the impact of field stars. We added field stars as nonmembers to the training set at 20 Myr. We then tested

Fig. 4.
Box plots of the F1 accuracy score for the five different parameter combinations used in the training set.

Fig. 5. F1 score accuracies for the 21 different clusters. Instances with the same training and testing set are dropped.

the model on the 100 Myr and 1 Gyr snapshots. The corresponding F1 scores are shown as crosses in the top panel of Fig. 2. The inclusion of the field stars does not affect the F1 score significantly. The accuracy of the test at t = 1 Gyr matches that of the test without field stars. For the tests at 100 Myr, all the accuracies are slightly lower than the results without field stars. For an SFE of 17% and an SFE of 20%, the accuracies are 1.1% and 2.0% lower, respectively. For an SFE of 25%, the drop in accuracy is 5.5%. The values are provided in Table 3.

3.2. Tests on observational data

We applied our ML approach to the clusters from the Gaia data. As mentioned, we trained four different models (on the default N-body simulation set and on the Blanco 1, Pleiades, and NGC 2516

Fig. 6. F1 score as a function of age, mass, half-mass radius, and density.
The different colors correspond to models trained on Blanco 1, the Pleiades, NGC 2516, and the N-body simulations. Individual dots correspond to the results of testing on clusters from the Gaia data.

Table 4. Characteristics of the training sets and the values of the F1 scores.

Cluster    Members (#)  Mass (M⊙)  Age (Myr)  rh (pc)  Mean F1 (%)
Blanco 1   703          338.6      100        6.7      51.4
Pleiades   1407         740.6      125        4.7      33.0
N-body     2139         1164.2     25         4.7      61.4
NGC 2516   2690         1984.8     123        7.9      60.8

Notes. Columns 1 to 5 show the cluster name, number of member stars, mass, age, and half-mass radius rh. The mean F1 score (last column) is the average value of the F1 scores obtained by testing on the Gaia data.

clusters) and then tested them on the 21 Gaia clusters. Figure 5 shows the F1 score for all combinations of training and testing data. In the following, we first discuss the impact of the training data and then the impact of the testing data.

The models trained on the N-body simulation, Blanco 1, the Pleiades, and the NGC 2516 clusters yield average F1 scores¹ of 61.4, 51.4, 33.0, and 60.8%, respectively, as summarized in Table 4. The highest accuracy of ≈91.1% is reached when the model trained on NGC 2516 is tested on NGC 3532. The lowest accuracy of F1 ⩽ 1% is reached when the models trained on Blanco 1 and the Pleiades are tested on IC 4756; 22 296 and 11 408 field stars are misclassified as FPs by these two models.

We analyzed the impact of the quantitative properties of the training sets on the classification accuracy. Table 4 provides the mean F1 scores averaged over the Gaia data for the models trained on the N-body simulation, Blanco 1, the Pleiades, and the NGC 2516 clusters. No clear dependence of the accuracy on the training set mass, age, and rh is apparent. However, the mean F1 score tends to be higher for the models trained on clusters with more member stars, although the dependence is not monotonic.
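The mean F1 scores of Table 4 average over the tested Gaia clusters while skipping the case where the training and testing clusters coincide. A trivial sketch (the function and dictionary layout are ours):

```python
def mean_f1(scores_by_test_cluster, train_cluster):
    """Average F1 over the test clusters, excluding the training cluster itself."""
    values = [f1 for name, f1 in scores_by_test_cluster.items()
              if name != train_cluster]
    return sum(values) / len(values)
```

Excluding the self-test matters: testing a model on its own training cluster would inflate the mean with a near-perfect score.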
Figure 6 shows scatter plots of the F1 score for the different clusters as a function of the age (left panel), total mass (middle panel), and half-mass radius (right panel). We do not observe any clear correlation between the F1 score and these quantities. The model trained on the Pleiades cluster exhibits an overall lower accuracy than the other models. This is consistent with its lowest average F1 score of 33.0% mentioned above.

¹ Here, we average the F1 scores of all tested Gaia clusters for each model. When we average, we exclude the instances where the training set is the same as the testing data.

Fig. 7. Classification results in terms of parallaxes for the models trained on the N-body simulation (left panel) and the Pleiades (right panel) and tested on the Praesepe cluster.

As an example of the classification, Fig. 7 shows the histograms of parallaxes for the Praesepe cluster. We applied the models trained on the N-body simulation (left panel) and the Pleiades (right panel). The histograms of different colors represent the TPs, TNs, FPs, and FNs according to the key. Overall, the two models yield similar results.
The accuracy is ≈79.0% and ≈71.0% for the models trained on the N-body simulations and the Pleiades, respectively. For the former model, the number of TPs is 501 and that of FNs is 202. For the latter model, we have 424 TPs and 279 FNs.

The reason for this behavior can be identified by considering individual stars. Figure 8 shows the FPs, FNs, and TPs as colored dots, as indicated in the figure key, for individual stars as a function of the coordinates α and δ (top panels), the proper motions in α and δ (middle panels), and the magnitudes (bottom panels). The left panels show the model trained on the N-body simulation, and the right panels show the model trained on the Pleiades cluster. The TP stars of the model trained on the N-body simulation appear to be closer to the center than the TPs of the model trained on the Pleiades. This is visible in the top and middle panels. The locations of the FP stars differ between the two models: the FPs are mainly located near the cluster center for the model trained on the N-body simulations, whereas they are primarily located outside the cluster for the model trained on the Pleiades. The FNs appear in similar locations, but there are more FNs inside the cluster for the model trained on the Pleiades, as seen in the top and middle panels. The bottom panels show that most FPs and FNs of both models are low-mass (dim and cold) stars, located in the bottom right part of the plot.
Fig. 8. Classification results for individual stars for the models trained on the N-body simulation (left panels) and the Pleiades (right panels) and tested on the Praesepe cluster. The top panels show the coordinates α and δ. The middle panels show the proper motions in α and δ. The bottom panels show apparent magnitude and color. Due to their large number, the true negatives are not visualized in these plots.

3.3. Comparison with other methods

We compared the membership derived from our approach to that from the non-ML method of Röser & Schilbach (2019), which is based on a modified convergent-point method. We also compared it to the membership from Pang et al. (2022), who used the unsupervised ML method StarGO. We chose the Praesepe cluster as an example for this comparison and used stars within the same field of view as in the previous sections (see Sect. 2). Of the 1393 member stars from Röser & Schilbach (2019), 1079 lie in this field of view. The stars outside the selected field of view are considered part of the tidal tail of Praesepe. We display the Venn diagram comparing the three methods in Fig. 9. The blue circle corresponds to our method, and the red and gray circles represent the members of Pang et al. (2022) and Röser & Schilbach (2019), respectively.
Our method identifies 799 stars as members of Praesepe, while Pang et al. (2022) and Röser & Schilbach (2019) identified 982 and 1079 member stars, respectively. Among all stars, 645 are cross-matched in all three memberships. Seventy stars identified as members by our method are not recognized by either of the other two methods. Two hundred and ten members identified by Röser & Schilbach (2019) are not recovered by the other two methods; most of these are located in the extended tidal tails.

Fig. 9. Venn diagram of the comparison of the three membership identification methods. Blue corresponds to our method, and gray and red correspond to Röser & Schilbach (2019) and Pang et al. (2022), respectively.

Figure 10 shows the color-magnitude diagrams (CMDs) for the member stars identified by the three methods. All three show similar patterns. Our method identifies more faint stars with low effective temperatures because we did not apply a quality cut to the stars before membership identification. These faint stars have larger observational uncertainties than most of the other stars. The left panel of Fig. 11 displays the distribution of stars by the proper motions in α and δ. Members from our method are strongly concentrated around µδ ≈ −12 mas/yr and µα ≈ 35 mas/yr within a radius of ≈4 mas/yr. Members of Pang et al. (2022) extend somewhat beyond this region, roughly within a radius of ≈5 mas/yr. The members identified by Röser & Schilbach (2019) are distributed most widely, reaching well outside (≳10 mas/yr) the proper-motion center. This is consistent with the radial distribution shown in the middle panel of Fig. 11: only 4 member stars identified by our method are located farther than 15 pc from the cluster center, while 73 and 194 of the Pang et al. (2022) and Röser & Schilbach (2019) members, respectively, lie more than 15 pc away. Similarly, the parallax distribution (right panel of Fig.
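The region counts shown in the Venn diagram of Fig. 9 reduce to set operations on the three membership lists once the stars are cross-matched (e.g., by Gaia source ID). A minimal sketch; `venn_counts` is a hypothetical helper name, not code from this work:

```python
def venn_counts(a: set, b: set, c: set) -> dict:
    """Sizes of the seven exclusive regions of a three-set Venn diagram."""
    return {
        "only_a": len(a - b - c),          # members found by method A alone
        "only_b": len(b - a - c),
        "only_c": len(c - a - b),
        "a_b": len((a & b) - c),           # shared by A and B but not C
        "a_c": len((a & c) - b),
        "b_c": len((b & c) - a),
        "all": len(a & b & c),             # cross-matched in all three methods
    }

# Tiny illustrative example with fake source IDs:
print(venn_counts({1, 2, 3}, {2, 3, 4}, {3, 4, 5}))
```

With `a` our members, `b` those of Pang et al. (2022), and `c` those of Röser & Schilbach (2019), the `"all"` entry would give the 645 cross-matched stars and `"only_a"` the 70 stars unique to our method.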
11) reveals that the member stars identified by our method are confined to a concentrated central region. The main reason for this difference is that our method uses the proper motions and parallax for training.

4. Conclusion

We performed a membership analysis of stellar clusters using supervised ML algorithms. We trained and tested our models on snapshot data from N-body simulations of stellar clusters and on observed clusters from the Gaia DR3 data. Our findings are summarized below.

We studied five supervised ML algorithms on N-body simulation data: RF, DT, SVM, FFNN, and KNN. All models produced comparable accuracies within ≈1% (see Appendix A). Following this result, we used the RF method for the rest of the paper.

We then explored the impact of eight different observational parameters on the accuracy of the membership identification.

Fig. 10. Comparison of the three membership identifications in the color-magnitude diagrams.

Fig. 11. Comparison of the three membership labels in terms of proper motions in α and δ (left panel), radial distribution (middle panel), and parallaxes (right panel). Blue, red, and black represent the member stars identified by our method, Pang et al.
(2022), and Röser & Schilbach (2019), respectively.

Five of these eight were astrometric parameters: right ascension (α), declination (δ), the proper motions in α (µα) and δ (µδ), and the parallax (π); two were photometric parameters: the apparent magnitude mG and the color GBP − GRP. The last parameter was the radial velocity (rv), which is measured with spectroscopy. We tested five different combinations of these eight parameters. The highest accuracy was achieved by models trained on the purely astrometric parameters α, δ, µα, µδ, and π. The parallax is the most critical parameter in the membership classification. The inclusion of rv and the photometric features does not seem to improve the accuracy; however, this might be biased by the current observational uncertainty of rv.

We studied the impact of the SFE and the cluster age on the classification accuracy using N-body simulation data. The F1 score, which quantifies the reliability of our ML method, is ≳90% in all snapshots before cluster dissolution. When the cluster is mostly dissolved, the F1 score drops to ≈70%. The classification errors during dissolution are primarily due to FNs. The statistical results of three snapshots of different SFEs at different times show that the accuracy is similar regardless of the time and SFE of the snapshot used for training.

Additionally, we explored the impact of the number of member and nonmember stars in the datasets. Generally, for the successful performance of supervised ML models, it is desirable to have a similar number of samples in all classes. The datasets of OCs typically contain a large number of field stars that are not members. We found that the number of field stars within the training dataset does not affect the classification accuracy.

We applied our model to 21 clusters from the Gaia DR3 data.
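The training setup on the five astrometric parameters can be sketched with scikit-learn, which provides the RF implementation cited in this work (Pedregosa et al. 2011). The clumped "member" population and broad "field" population below are synthetic stand-ins for illustration only, not our actual data; all numbers are assumptions:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score

rng = np.random.default_rng(42)

# Synthetic stand-in for the five astrometric features:
# (alpha [deg], delta [deg], mu_alpha [mas/yr], mu_delta [mas/yr], parallax [mas]).
# Cluster members occupy a tight clump in this space; field stars are spread broadly.
members = rng.normal([130.0, 19.7, -36.0, -13.0, 5.4], 0.3, size=(500, 5))
field = rng.uniform([120, 10, -60, -30, 0], [140, 30, 0, 10, 10], size=(5000, 5))
X = np.vstack([members, field])
y = np.r_[np.ones(500), np.zeros(5000)]  # 1 = member, 0 = field star

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_tr, y_tr)
print(f"F1 on held-out stars: {f1_score(y_te, clf.predict(X_te)):.3f}")
```

Note that the ~10:1 class imbalance is deliberately left in place, mirroring our finding that a large number of field stars in the training set does not degrade the classification accuracy.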
These clusters are Blanco 1, Collinder 69, Huluwa 3, IC 4756, LP 2373 gp2, LP 2373 gp4, LP 2383, LP 2442, Mamajek 4, NGC 1980, NGC 2422, NGC 2451B, NGC 2516, NGC 3532, NGC 6405, NGC 6475, NGC 6633, the Pleiades, Praesepe, Stephenson 1, and UBC 31. In addition to the N-body simulation data, we trained our model on the Blanco 1, Pleiades, and NGC 2516 clusters. The models trained on the N-body simulation and on NGC 2516 yield an average F1 score of ≈60%. The models trained on Blanco 1 and the Pleiades show lower average F1 scores of ≈51% and ≈32%, respectively. The models trained on clusters with a larger number of member stars (NGC 2516 and the N-body simulation) tend to yield a higher classification accuracy, but the dependence is not monotonic. The two models with the highest accuracy have over 2000 member stars in the training set; the two models with the lowest accuracy (trained on Blanco 1 and the Pleiades) have fewer than 2000 members. Among these clusters, we find no noticeable correlation between the classification accuracy and the mass, age, and half-mass radius (rh) of the clusters.

We compared our membership determination results with the memberships of Pang et al. (2022) and Röser & Schilbach (2019) for the Praesepe cluster. In total, 645 member stars were cross-matched in all three methods. Our model retrieved 124 and 280 fewer stars than Pang et al. (2022) and Röser & Schilbach (2019), respectively. The members identified by our method are more concentrated in both the spatial and the proper-motion distributions than those of the other two methods. Our models were trained on a five-parameter space: right ascension, declination, parallax, and the proper motions. Therefore, the parallax and proper motions carry greater weight in the membership identification. Our work suggests that ML approaches are promising for membership analysis despite several limitations.
We used a limited number of 21 Gaia clusters; the inclusion of more clusters should improve the accuracy. It is also worthwhile to compare our approach with unsupervised ML methods (e.g., StarGO, DBSCAN, HDBSCAN, or GMMs). Moreover, cross-matching the results with other membership studies can further improve the results. These limitations will be addressed in future studies.

Acknowledgements. The authors would like to thank the anonymous referee for his/her valuable comments and suggestions, which have helped to improve the quality of this manuscript. This research has been funded by the Science Committee of the Ministry of Education and Science, Republic of Kazakhstan (Grant Nos. AP13067834 and AP19677351, and Program No. BR20280974). Additionally, funding is provided through the Nazarbayev University Faculty Development Competitive Research Grant Program, with Grant No. 11022021FD2912. Xiaoying Pang acknowledges the financial support of the National Natural Science Foundation of China through grants 12173029 and 12233013. Peter Berczik thanks the support from the special program of the Polish Academy of Sciences and the US National Academy of Sciences under the Long-term program to support the Ukrainian research teams, grant No. PAN.BFB.S.BWZ.329.022.2023. The data used in this study can be obtained from the authors upon request.

References
Agarwal, M., Rao, K. K., Vaidya, K., & Bhattacharya, S. 2021, MNRAS, 502, 2582
Ascenso, J., Alves, J., & Lago, M. T. V. T. 2009, A&A, 495, 147
Astropy Collaboration (Robitaille, T. P., et al.) 2013, A&A, 558, A33
Bação, F., Lobo, V., & Painho, M. 2005, in Computational Science – ICCS 2005: 5th International Conference, Atlanta, GA, USA, May 22–25, 2005, Proceedings, Part III 5 (Springer), 476
Bebis, G., & Georgiopoulos, M. 1994, IEEE Potentials, 13, 27
Bishop, C. M. 2006, Pattern Recognition and Machine Learning (Springer)
Bland-Hawthorn, J., & Gerhard, O. 2016, ARA&A, 54, 529
Bouma, L. G., Curtis, J. L., Hartman, J. D., Winn, J. N., & Bakos, G. Á. 2021, AJ, 162, 197
Breiman, L. 2001, Mach. Learn., 45, 5
Breiman, L., Friedman, J., Olshen, R., & Stone, C. 1984, Classification and Regression Trees (Chapman and Hall/CRC)
Campello, R. J. G. B., Moulavi, D., & Sander, J. 2013, in Advances in Knowledge Discovery and Data Mining, eds. J. Pei, V. S. Tseng, L. Cao, H. Motoda, & G. Xu (Berlin, Heidelberg: Springer), 160
Cantat-Gaudin, T., Jordi, C., Vallenari, A., et al. 2018, A&A, 618, A93
Cantat-Gaudin, T., Anders, F., Castro-Ginard, A., et al. 2020, A&A, 640, A1
Castro-Ginard, A., Jordi, C., Luri, X., et al. 2020, A&A, 635, A45
Chen, L., Hou, J. L., & Wang, J. J. 2003, AJ, 125, 1397
Chen, Y., Girardi, L., Fu, X., et al. 2019, A&A, 632, A105
Conrad, C., Scholz, R. D., Kharchenko, N. V., et al. 2017, A&A, 600, A106
Cortes, C., & Vapnik, V. 1995, Mach. Learn., 20, 273
Cover, T. M., & Hart, P. E. 1967, IEEE Trans. Inform. Theory, 13, 21
Das, S. R., Gupta, S., Prakash, P., Samal, M., & Jose, J. 2023, ApJ, 948, 7
Deb, S., Baruah, A., & Kumar, S. 2022, MNRAS, 515, 4685
de la Fuente Marcos, R., & de la Fuente Marcos, C. 2004, New A, 9, 475
Dias, W. S., Monteiro, H., Caetano, T. C., et al. 2014, A&A, 564, A79
Ester, M., Kriegel, H.-P., Sander, J., & Xu, X. 1996, in Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (KDD) (ACM), 226
Fürnkranz, J. 2010, Decision Tree, eds. C. Sammut, & G. I. Webb (Boston, MA: Springer US), 263
Gaia Collaboration (Brown, A. G. A., et al.) 2016a, A&A, 595, A2
Gaia Collaboration (Prusti, T., et al.) 2016b, A&A, 595, A1
Gaia Collaboration (Brown, A. G. A., et al.) 2018, A&A, 616, A1
Gaia Collaboration (Brown, A. G. A., et al.) 2021, A&A, 649, A1
Gaia Collaboration (Vallenari, A., et al.) 2023, A&A, 674, A1
Gao, X.-H. 2014, Res. Astron. Astrophys., 14, 159
Gao, X. 2018a, ApJ, 869, 9
Gao, X.-H. 2018b, Ap&SS, 363, 232
Gao, X.-H. 2019a, PASP, 131, 044101
Gao, X.-H. 2019b, MNRAS, 486, 5405
Ghosh, E. M., Sulistiyowati, Tucio, P., & Fajrin, M. 2022, J. Phys. Conf. Ser., 2214, 012009
Goutte, C., & Gaussier, E. 2005, in Lecture Notes in Computer Science, 3408, 345
Guido, R. M. D., Tucio, P. B., Kalaw, J. B., & Geraldo, L. E. 2023, in IOP Conference Series: Earth and Environmental Science, 1167, 012010
Hunt, E. L., & Reffert, S. 2021, A&A, 646, A104
Hunt, E. L., & Reffert, S. 2023, A&A, 673, A114
Jadhav, V. V., Pennock, C. M., Subramaniam, A., Sagar, R., & Nayak, P. K. 2021, MNRAS, 503, 236
Jaehnig, K., Bird, J., & Holley-Bockelmann, K. 2021, ApJ, 923, 129
Just, A., Berczik, P., Petrov, M. I., & Ernst, A. 2009, MNRAS, 392, 969
Kalambay, M. T., Naurzbayeva, A. Z., Otebay, A. B., et al. 2022, Recent Contrib. Phys., 83, 4
Kharchenko, N. V., Piskunov, A. E., Röser, S., Schilbach, E., & Scholz, R. D. 2005, A&A, 438, 1163
Kharchenko, N. V., Piskunov, A. E., Schilbach, E., Röser, S., & Scholz, R. D. 2012, A&A, 543, A156
Kohl, M. 2012, Int. J. Statist. Med. Res., 1, 79
Kounkel, M., & Covey, K. 2019, AJ, 158, 122
Krone-Martins, A., & Moitinho, A. 2015, ascl:1504.001
Krumholz, M. R., McKee, C. F., & Bland-Hawthorn, J. 2019, ARA&A, 57, 227
Lada, C. J., & Lada, E. A. 2003, ARA&A, 41, 57
Liu, L., & Pang, X. 2019, ApJS, 245, 32
Lodieu, N., Pérez-Garrido, A., Smart, R. L., & Silvotti, R. 2019, A&A, 628, A66
Mahmudunnobe, M., Hasan, P., Raja, M., & Hasan, S. N. 2021, Eur. Phys. J. Special Top., 230, 2177
Maíz Apellániz, J., & Weiler, M. 2018, A&A, 619, A180
Meingast, S., & Alves, J. 2019, A&A, 621, A3
Meingast, S., Alves, J., & Rottensteiner, A. 2021, A&A, 645, A84
Noormohammadi, M., Khakian Ghomi, M., & Haghi, H. 2023, MNRAS, 523, 3538
Olivares, J., Lodieu, N., Béjar, V. J. S., et al. 2023, A&A, 675, A28
Pang, X., Li, Y., Tang, S.-Y., Pasquato, M., & Kouwenhoven, M. B. N. 2020, ApJ, 900, L4
Pang, X., Tang, S.-Y., Li, Y., et al. 2022, ApJ, 931, 156
Pang, X., Liao, S., Li, J., et al. 2024, ApJ, 966, 169
Parmentier, G., & Pfalzner, S. 2013, in Protostars and Planets VI Posters
Paszke, A., Gross, S., Massa, F., et al. 2019, in Advances in Neural Information Processing Systems 32, eds. H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alché-Buc, E. Fox, & R. Garnett (Curran Associates, Inc.), 8024
Pedregosa, F., Varoquaux, G., Gramfort, A., et al. 2011, J. Mach. Learn. Res., 12, 2825
Pera, M. S., Perren, G. I., Moitinho, A., Navone, H. D., & Vazquez, R. A. 2021, A&A, 650, A109
Peterson, L. E. 2009, Scholarpedia, 4, 1883
Phelps, R. L., & Janes, K. A. 1994, ApJS, 90, 31
Portegies Zwart, S. F., McMillan, S. L. W., & Gieles, M. 2010, ARA&A, 48, 431
Renaud, F. 2018, New A Rev., 81, 1
Röser, S., & Schilbach, E. 2019, A&A, 627, A4
Röser, S., Schilbach, E., & Goldman, B. 2019, A&A, 621, A2
Ruprecht, J., Balázs, B., & White, R. E. 1981, Akad. Kiado, 0
Rybizki, J. 2019, Galaxia_wrap: Galaxia wrapper for generating mock stellar surveys, Astrophysics Source Code Library, record ascl:1901.005
Sharma, S., Bland-Hawthorn, J., Johnston, K. V., & Binney, J. 2011, ApJ, 730, 3
Shukirgaliyev, B., Parmentier, G., Berczik, P., & Just, A. 2017, A&A, 605, A119
Shukirgaliyev, B., Otebay, A., Just, A., et al. 2019, Reports of NAS RK, Physico-mathematical series, 130
Shukirgaliyev, B., Otebay, A., Sobolenko, M., et al. 2021, A&A, 654, A53
Sindhu Meena, K., & Suriya, S. 2020, in Proceedings of International Conference on Artificial Intelligence, Smart Grid and Smart City Applications, eds. L. A. Kumar, L. S. Jayashree, & R. Manimegalai (Cham: Springer International Publishing), 627
Stock, J. 1956, ApJ, 123, 258
Tang, S.-Y., Pang, X., Yuan, Z., et al. 2019, ApJ, 877, 12
Tucio, P. B., Guido, R. M. D., & Kalaw, J. B. 2023, IOP Conf. Ser.: Earth Environ. Sci., 1167, 012002
van Groeningen, M. G. J., Castro-Ginard, A., Brown, A. G. A., Casamiquela, L., & Jordi, C. 2023, A&A, 675, A68
Yuan, Z., Chang, J., Banerjee, P., et al. 2018, ApJ, 863, 26
Appendix A: Dependence on the ML model

Fig. A.1. Average F1 score as a function of time for different ML models.

In this section, we check how our results depend on the ML model. We applied the RF, DT, SVM, FFNN, and KNN models.
Figure A.1 shows the F1 score as a function of time for different SFE values and different ML models. The colored lines represent the F1 score as a function of time for the RF, DT, SVM, FFNN, and KNN models. The training is performed on N-body simulation snapshots at 20 Myr. All ML models exhibit similar F1 scores; the difference between the models does not exceed ≈1%. The RF model performs slightly better within this difference, especially for t ≲ 200 Myr. We thus adopted the RF model in the main part of our work.
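A comparison of this kind can be sketched with scikit-learn, which provides all five classifier families. The hyperparameters and the synthetic member/field data below are illustrative assumptions, not the configuration used in this work:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
# Synthetic stand-in for one N-body snapshot: a tight member clump in the
# five-dimensional astrometric space versus a broadly distributed field.
members = rng.normal([130.0, 19.7, -36.0, -13.0, 5.4], 0.3, size=(400, 5))
field = rng.uniform([120, 10, -60, -30, 0], [140, 30, 0, 10, 10], size=(4000, 5))
X = np.vstack([members, field])
y = np.r_[np.ones(400), np.zeros(4000)]  # 1 = member, 0 = field star

models = {
    "RF": RandomForestClassifier(n_estimators=100, random_state=0),
    "DT": DecisionTreeClassifier(random_state=0),
    "SVM": SVC(),
    "FFNN": MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000, random_state=0),
    "KNN": KNeighborsClassifier(),
}
for name, model in models.items():
    # Feature scaling matters for SVM, FFNN, and KNN; it is harmless for trees.
    pipe = make_pipeline(StandardScaler(), model)
    f1 = cross_val_score(pipe, X, y, cv=3, scoring="f1").mean()
    print(f"{name}: mean F1 = {f1:.3f}")
```

On such cleanly separable data all five classifiers score similarly, which is the qualitative behavior seen in Fig. A.1.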