WANG Ximing, ZHONG Lijin, DING Longzhen, et al. Research on Improving Management Efficiency of Environmental Complaints Based on Large Language Model[J]. Chinese Journal of Environmental Management, 2025, 17(6): 106-115.
Research on Improving Management Efficiency of Environmental Complaints Based on Large Language Model
DOI:10.16868/j.cnki.1674-6252.2025.06.106
Keywords: environmental complaint; large language model; text mining; public participation; citizen science; environmental management
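Authors and affiliations:
WANG Ximing (王希铭): State Key Laboratory of Soil Pollution Control and Safety, Southern University of Science and Technology, Shenzhen, Guangdong 518055; School of Environmental Science and Engineering, Southern University of Science and Technology, Shenzhen, Guangdong 518055
ZHONG Lijin (钟丽锦): Beijing Huanding Environmental Big Data Institute, Beijing 100083
DING Longzhen (丁隆真): State Key Laboratory of Soil Pollution Control and Safety, Southern University of Science and Technology, Shenzhen, Guangdong 518055; School of Environmental Science and Engineering, Southern University of Science and Technology, Shenzhen, Guangdong 518055
LYU Guangfeng (吕广丰): Beijing Huanding Environmental Big Data Institute, Beijing 100083
QI Xingyu (齐兴育): Beijing Huanding Environmental Big Data Institute, Beijing 100083
HU Qing (胡清): State Key Laboratory of Soil Pollution Control and Safety, Southern University of Science and Technology, Shenzhen, Guangdong 518055; School of Environmental Science and Engineering, Southern University of Science and Technology, Shenzhen, Guangdong 518055; E-mail: huq@sustech.edu.cn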
Abstract:
Environmental complaints are a key channel for public participation in environmental governance, and they generate a large volume of unstructured text. Traditional analyses that rely on structured labels make little use of this text, which limits the ability to detect latent environmental problems from complaints. Using 567,985 national environmental complaint records from 2016 to 2021, this study applies the DeepSeek large language model to mine the unstructured text and systematically compares it with structured labels for identifying pollution types. The findings are: ① unstructured text identifies pollution types more accurately, with a median F1-score of 0.92 versus 0.50 for structured labels; ② unstructured text reduces the perceptual bias toward easily perceived pollution types (the difference in association-rule confidence falls by 9.6%), and so reflects the public's actual environmental concerns more faithfully; ③ unstructured text is more sensitive to spatiotemporal features, which helps detect early signals of shifting public concern. Case studies indicate that the method identifies aquaculture wastewater and industrial exhaust gas problems approximately four and two years earlier, respectively, and reveals growing public attention to noise. The study provides a new LLM-based method for quickly identifying latent public environmental concerns in the unstructured text of complaints, which can help regulators respond quickly to public needs and advance refined management.
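For readers unfamiliar with the two metrics quoted in the abstract, the following is a minimal sketch of how a median per-type F1-score and an association-rule confidence can be computed. It is not the authors' pipeline: the function names, the single-label assumption (one pollution type per complaint), and the toy records are all hypothetical and serve only to illustrate the definitions.

```python
from statistics import median


def f1_per_label(y_true, y_pred):
    """Return {label: F1}, scoring each label one-vs-rest.

    Assumes one pollution type per complaint (an illustrative simplification).
    """
    scores = {}
    for label in set(y_true) | set(y_pred):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the label never occurs.
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def rule_confidence(records, antecedent, consequent):
    """confidence(A -> B) = count(A and B) / count(A) over item sets."""
    with_a = [r for r in records if antecedent in r]
    if not with_a:
        return 0.0
    return sum(consequent in r for r in with_a) / len(with_a)


# Hypothetical toy data: reference pollution types vs. predicted types.
y_true = ["air", "air", "water", "noise", "noise", "water"]
y_pred = ["air", "water", "water", "noise", "noise", "water"]
scores = f1_per_label(y_true, y_pred)
median_f1 = median(scores.values())

# Hypothetical complaints as sets of co-occurring items.
records = [{"odor", "air"}, {"odor", "air"}, {"odor", "water"}, {"noise"}]
conf_odor_air = rule_confidence(records, "odor", "air")
```

Under these definitions, a smaller gap between the confidences of two competing rules (for example, odor -> air versus odor -> water) would indicate a weaker bias toward the more easily perceived type; how the study itself constructs its rules is not specified in the abstract.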