Flood probability zonation using a comparative study of two well-known models, random forest and support vector machine, in northern Iran

Article type: Research article

Authors

1 Ph.D. Candidate, Department of Water Engineering, College of Agriculture, Kermanshah Branch, Islamic Azad University, Kermanshah, Iran.

2 Associate Professor, Department of Water Engineering, College of Agriculture, Kermanshah Branch, Islamic Azad University, Kermanshah, Iran.

3 Assistant Professor, Department of Water Engineering, College of Agriculture, Kermanshah Branch, Islamic Azad University, Kermanshah, Iran.

Abstract

The aim of this study is flood probability zonation in the Saliantapeh watershed, located in Golestan Province. To this end, two well-known data mining models, random forest (RF) and support vector machine (SVM), were used as benchmarks and as computationally capable algorithms for assessing flood occurrence. Flood evidence was recorded from field visits, reports, and available organizational information, and was mapped in a geographic information system (GIS). Based on an extensive literature review, thirteen conditioning factors, including distance from stream, lithological units, slope percent, soil texture, slope aspect, land use, profile and plan curvature, wetness index, stream power index, and elevation classes, were selected as factors affecting flood occurrence in the study area, and the corresponding layers were prepared in the GIS. After the layers were prepared, SPSS software was used to analyze the data and check for collinearity among them. The area under the receiver operating characteristic (ROC) curve was used to evaluate the models' results. Three different sets of flood occurrence points (S1, S2, S3), each comprising 70 percent for model training and 30 percent for validation, were prepared at random so that the accuracy and robustness of the models could be evaluated. The results showed that the mean map produced by the random forest model, with an area under the curve of 96 percent and a robustness of 0.001 in the validation stage, performs better than the support vector machine model for flood zonation in the study basin.
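The abstract states only that SPSS was used to test collinearity among the predictor layers; it does not name the statistic. As a minimal sketch, assuming the usual variance inflation factor (VIF) and tolerance diagnostics, the check can be reproduced from the correlation matrix of the predictors, since the VIF of predictor j equals the j-th diagonal entry of the inverse correlation matrix. The function names below are hypothetical and are not taken from the paper.

```python
import math


def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equal-length numeric columns."""
    n = len(columns[0])
    scaled = []
    for col in columns:
        mean = sum(col) / n
        dev = [v - mean for v in col]
        norm = math.sqrt(sum(d * d for d in dev))
        scaled.append([d / norm for d in dev])
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in scaled] for ci in scaled]


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: predictors are perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif_and_tolerance(columns):
    """Return (VIF list, tolerance list); tolerance is 1 / VIF."""
    inv = invert(correlation_matrix(columns))
    vifs = [inv[j][j] for j in range(len(columns))]
    return vifs, [1.0 / v for v in vifs]


# Illustrative values only (not data from the study): two nearly collinear layers.
slope = [1.0, 2.0, 3.0, 4.0]
spi = [1.0, 2.0, 3.0, 5.0]
vifs, tolerances = vif_and_tolerance([slope, spi])
```

A common rule of thumb, which the abstract does not confirm was the criterion used, flags predictors with a VIF above roughly 5 to 10 (tolerance below 0.1 to 0.2) as problematic.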


Article Title [English]

Flood probability zonation using a comparative study of two well-known random forest and support vector machine models in northern Iran

Authors [English]

  • Mohammad Reza Tahmasebi 1
  • Saeid Shabanlou 2
  • Ahmad Rajabi 3
  • Fariborz Yosefvand 3
1 Ph.D. Candidate, Department of Water Engineering, College of Agriculture, Kermanshah Branch, Islamic Azad University, Kermanshah, Iran.
2 Associate Professor, Department of Water Engineering, College of Agriculture, Kermanshah Branch, Islamic Azad University, Kermanshah, Iran.
3 Assistant Professor, Department of Water Engineering, College of Agriculture, Kermanshah Branch, Islamic Azad University, Kermanshah, Iran.
Abstract [English]

This study aims to map flood probability zones in the Saliantapeh catchment, located in Golestan Province. To this end, two well-known data mining models, Random Forest (RF) and Support Vector Machine (SVM), were applied because of their robust computational algorithms. A flood inventory was compiled from several field surveys, local information, and available organizational resources, and the corresponding map was created in a geographic information system (GIS). Based on a review of studies worldwide, 13 predisposing variables, including proximity to stream, soil texture, lithological units, land use/cover, slope percent, elevation (DEM), slope aspect, plan curvature, profile curvature, stream power index, and topographic wetness index, were chosen, and the corresponding maps were generated in the GIS. After the predictor maps were prepared, SPSS software was used to analyze the data and test for multicollinearity. The area under the receiver operating characteristic (ROC) curve (AUC) was used to evaluate the models' results. Three different sample data sets (S1, S2, S3), each comprising 70% of the points for training and 30% for validation, were randomly drawn to evaluate the robustness of the applied models. The results showed that the RF model, with an AUC of 0.96 and a robustness of 0.001 in the validation step, performed better than the SVM model for flood probability zonation over the study area.
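The evaluation scheme described in the abstract (random 70/30 splits, AUC, and a robustness figure across three sample sets) can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the abstract does not define robustness, so the sketch assumes it is the spread (maximum minus minimum) of AUC across the sample sets, and all function names are hypothetical. The AUC is computed with the rank-based identity that it equals the probability that a randomly chosen flood point scores higher than a randomly chosen non-flood point.

```python
import random


def auc(labels, scores):
    """Area under the ROC curve: fraction of (flood, non-flood) pairs ranked
    correctly, counting ties as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required to compute AUC")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def split_70_30(n_points, seed):
    """Randomly split point indices into 70% training and 30% validation."""
    idx = list(range(n_points))
    random.Random(seed).shuffle(idx)
    cut = int(round(0.7 * n_points))
    return idx[:cut], idx[cut:]


def robustness(auc_values):
    """ASSUMPTION: robustness is the max-min spread of AUC across sample sets;
    a smaller value means a more stable model."""
    return max(auc_values) - min(auc_values)


# Three sample sets (S1, S2, S3) drawn with different seeds.
sample_sets = [split_70_30(100, seed) for seed in (1, 2, 3)]

# Small worked check of the AUC function.
example_auc = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])  # 0.75
```

In practice, each model would be fitted on the training indices of a sample set, scored on the validation indices with `auc`, and the three validation AUC values passed to `robustness`.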

Keywords [English]

  • Flood
  • Golestan Province
  • Random forest
  • Robustness
  • Support vector machine