The influence of the natural environment on human habitation is central to understanding past human-environment interactions and remains a focal point of multidisciplinary research. Ancient settlement choices reflected a weighing of many environmental factors against the socio-economic needs of the time. Traditional GIS-based methods have demonstrated the impact of individual environmental elements on human settlement but do not objectively quantify the relative importance of different environmental factors. Archaeological predictive models, which primarily rely on logistic regression and maximum entropy modeling, offer a quantitative description of the relationship between ancient settlements and their environments. Recently, the rise of artificial intelligence and machine learning has provided new approaches for archaeological predictive modeling, though their potential remains largely untested.
In this context, a research team led by Professor Dong Guanghui from the School of Earth and Environmental Sciences, in collaboration with the Geographic Information Systems team, focused on the northeastern Tibetan Plateau, a region with significant natural environmental variations and distinct transitions in human activity from the Neolithic to the Bronze Age. The team employed several machine learning models to explore the settlement strategies of different cultural groups and to predict the distribution probabilities of archaeological sites and cultural boundaries within the study area. The categorical dependent variable combined the cultural attribution of each archaeological site (Yangshao-Majiayao, YS-MJY; Qijia, QJ; Kayue-Xindian-Nuomuhong, KXN) with simulated random points representing "non-site" locations. The predictors were 16 environmental variables in six categories: topography, vegetation, land suitability, hydrology, soil, and climate. A supervised random forest model was used to predict archaeological probability distributions, while unsupervised self-organizing maps (SOM) and the variable importance ranking generated by the random forest were used to interpret differences in environmental selection among the cultural groups.
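As a rough illustration of how such a dependent variable might be assembled, the sketch below combines labeled sites with simulated random "non-site" points. It is a minimal sketch under stated assumptions, not the team's procedure: the coordinates, the bounding box, the uniform sampling, and the helper names `simulate_non_sites` and `build_samples` are all hypothetical. In a real workflow each record would also carry the 16 environmental predictors extracted at its location.

```python
import random

# Class labels named in the text; "non-site" marks the simulated random points.
CLASSES = ["YS-MJY", "QJ", "KXN", "non-site"]


def simulate_non_sites(n, bounds, seed=0):
    """Draw n uniform random points inside a (min_x, min_y, max_x, max_y) box.

    Assumption: plain uniform sampling. The study may have applied further
    constraints (for example, a minimum distance from known sites).
    """
    rng = random.Random(seed)
    min_x, min_y, max_x, max_y = bounds
    return [
        (rng.uniform(min_x, max_x), rng.uniform(min_y, max_y))
        for _ in range(n)
    ]


def build_samples(sites, non_site_points):
    """Combine labeled sites and random points into (x, y, label) records."""
    samples = [(x, y, label) for x, y, label in sites]
    samples += [(x, y, "non-site") for x, y in non_site_points]
    return samples


# Hypothetical coordinates and bounding box, for illustration only.
sites = [
    (101.2, 36.1, "YS-MJY"),
    (102.0, 35.8, "QJ"),
    (100.5, 36.6, "KXN"),
]
bounds = (99.0, 34.0, 104.0, 38.0)

samples = build_samples(sites, simulate_non_sites(3, bounds))
for x, y, label in samples:
    print(f"{x:8.3f} {y:8.3f} {label}")
```

The resulting records give a four-class target (three cultural groups plus "non-site") of the kind a supervised classifier such as a random forest can be trained on.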
Figure 1: Technical Roadmap
The model's classification results were evaluated using out-of-bag estimates, the holdout method (80% training set, 20% validation set), and 10-fold cross-validation, and the outcomes were further validated through Kvamme gain analysis. The results show the following. (1) The random forest model effectively predicts archaeological site distribution probabilities (Kvamme gain G = 0.89; values closer to 1 indicate higher predictive performance). Variable importance rankings indicated that elevation, arable land suitability, climate, and soil erosion were the most critical factors influencing human environmental choices. (2) The SOM mapping results, combined with studies on climate background and livelihood strategies, revealed differentiated environmental selection strategies and driving mechanisms among cultural groups: the Yangshao-Majiayao populations lived at lower elevations under suitable hydrothermal conditions, consistent with the millet cultivation and wild-animal hunting prevalent in their society; the Qijia culture adapted to the worsening climate with a pastoral economy, marking a crucial transition from the Neolithic to the Bronze Age; the Kayue-Xindian-Nuomuhong culture adopted introduced crops and livestock (wheat, cattle, sheep, etc.) on a large scale, breaking through existing environmental limitations and adapting to more complex high-altitude environments.
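For readers unfamiliar with the Kvamme gain statistic mentioned above, the sketch below uses its standard definition: G = 1 - (share of the study area flagged as high probability) / (share of known sites falling inside that area). The function name `kvamme_gain` and the input values are illustrative assumptions; the text does not report the area and site proportions behind G = 0.89.

```python
def kvamme_gain(area_fraction, site_fraction):
    """Return Kvamme gain G = 1 - area_fraction / site_fraction.

    area_fraction: share of the study area classified as high probability.
    site_fraction: share of known sites that fall inside that area.
    """
    if not 0.0 <= area_fraction <= 1.0:
        raise ValueError("area_fraction must lie in [0, 1]")
    if not 0.0 < site_fraction <= 1.0:
        raise ValueError("site_fraction must lie in (0, 1]")
    return 1.0 - area_fraction / site_fraction


# Hypothetical inputs, not taken from the study: a model that flags 20% of
# the area and captures 80% of the sites.
gain = kvamme_gain(area_fraction=0.2, site_fraction=0.8)
print(f"Kvamme gain: {gain:.2f}")
```

A gain near 1 means the model captures most sites while flagging only a small part of the landscape; a gain of 0 means it does no better than chance; a negative gain means the flagged area holds proportionally fewer sites than its size would suggest.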
Figure 2: Predictions by the Random Forest Model on the Distribution Range of All Archaeological Sites (a); and on the Distribution Ranges of Different Cultural Periods (c-d).
Figure 3: Changes in Climate Backgrounds (a) and Livelihood Patterns (b-c) across Different Cultural Periods.
The study, titled "GIS and Machine Learning Models Target Dynamic Settlement Patterns and Their Driving Mechanisms from the Neolithic to Bronze Age in the Northeastern Tibetan Plateau," was published in the internationally renowned journal Remote Sensing. Ph.D. student Li Gang is the first author and Ph.D. student Dong Jiajia is the corresponding author. The project was supported by the National Natural Science Foundation of China International (Regional) Cooperation and Exchange Program (No. 42261144670), the Second Comprehensive Scientific Investigation of the Tibetan Plateau (2019QZKK0601), the Yunnan Province Dong Guanghui Expert Workstation project (202305AF150183), and the European Research Council project (ERC-2019-ADG-883700-TRAM).