update
This commit is contained in:
169
storage/FUHPB4WI/.zotero-ft-cache
Normal file
@@ -0,0 +1,169 @@
Uncertainty-Informed Screening for Safer Solvents Used in the Synthesis of Perovskites via Language Models
Journal of Chemical Information and Modeling, Article, 2025
DOI: 10.1021/acs.jcim.5c00612
Authors: Mukherjee, Arpan (a); Giri, Deepesh (b); Rajan, Krishna (a)
(a) Department of Materials Design and Innovation, University at Buffalo, Buffalo, 14260−1660, NY, United States
Citations: 1 (70th percentile); Field-Weighted Citation Impact (FWCI): 0.69; References: 78
Abstract
Automated data curation for niche scientific topics, where data quality and contextual accuracy are paramount, poses significant challenges. Bidirectional contextual models such as BERT and ELMo excel in contextual understanding and determinism. However, they are constrained by their narrower training corpora and inability to synthesize information across fragmented or sparse contexts. Conversely, autoregressive generative models like GPT can synthesize dispersed information by leveraging broader contextual knowledge and yet often generate plausible but incorrect (“hallucinated”) information. To address these complementary limitations, we propose an ensemble approach that combines the deterministic precision of BERT/ELMo with the contextual depth of GPT. We have developed a hierarchical knowledge extraction framework to identify perovskites and their associated solvents in perovskite synthesis, progressing from broad topics to narrower details using two complementary methods. The first method leverages deterministic models like BERT/ELMo for precise entity extraction, while the second employs GPT for broader contextual synthesis and generalization. Outputs from both methods are validated through structure-matching and entity normalization, ensuring consistency and traceability. In the absence of benchmark data sets for this domain, we hold out a subset of papers for manual verification to serve as a reference set for tuning the rules for entity normalization. This enables quantitative evaluation of model precision, recall, and structural adherence while also providing a grounded estimate of model confidence. By intersecting the outputs from both methods, we generate a list of solvents with maximum confidence, combining precision with contextual depth to ensure accuracy and reliability. 
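The intersection step and the precision/recall evaluation against a manually verified hold-out set can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the `normalize` helper is a simple stand-in for the structure-matching and entity normalization the abstract mentions, and the function names and solvent lists are hypothetical.

```python
def normalize(name: str) -> str:
    """Stand-in for entity normalization: lowercase and collapse whitespace."""
    return " ".join(name.lower().split())


def intersect_and_score(method_a, method_b, reference):
    """Intersect two extraction outputs and score the result against a reference set.

    Returns (consensus set, precision, recall). The reference set plays the
    role of the manually verified hold-out papers described in the abstract.
    """
    a = {normalize(s) for s in method_a}
    b = {normalize(s) for s in method_b}
    ref = {normalize(s) for s in reference}

    consensus = a & b                # solvents both methods agree on
    true_pos = consensus & ref       # agreed solvents that are also verified

    precision = len(true_pos) / len(consensus) if consensus else 0.0
    recall = len(true_pos) / len(ref) if ref else 0.0
    return consensus, precision, recall


# Hypothetical outputs, for illustration only (not data from the paper).
bert_out = ["DMF", "dmso", "Toluene"]
gpt_out = ["dmf", "DMSO", "ethanol"]
held_out = ["DMF", "DMSO", "GBL"]

consensus, precision, recall = intersect_and_score(bert_out, gpt_out, held_out)
```

In this toy case the consensus list contains only verified solvents but misses one reference solvent, so precision is high while recall is lower.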
This approach increases precision at the expense of recall, a trade-off we accept given that, in high-trust scientific applications, minimizing hallucinations is often more critical than achieving full coverage, especially when downstream reliability is paramount. As a case study, the curated data set is used to predict the endocrine-disrupting (ED) potential of solvents with a pretrained deep learning model. Recognizing that machine learning models may not be trained on niche data sets such as perovskite-related solvents, we have quantified epistemic uncertainty using Shannon entropy. This measure evaluates the confidence of the ML model predictions, independent of uncertainties in the NLP-based data curation process, and identifies high-risk solvents requiring further validation. Additionally, the manual verification pipeline addresses ethical considerations around trust, structure, and transparency in AI-curated data sets.
© 2025 The Authors. Published by American Chemical Society
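As a rough illustration of entropy-based screening, the sketch below computes the Shannon entropy H = -sum_i p_i log2(p_i) of a model's predicted class probabilities and flags solvents whose entropy exceeds a threshold. The abstract does not say how the authors apply the measure, so the two-class setup, the 0.9-bit threshold, the function names, and the example probabilities are all assumptions.

```python
import math


def shannon_entropy(probs, base=2.0):
    """Shannon entropy of a discrete distribution (in bits when base=2).

    Probabilities are renormalized to sum to 1; zero entries contribute nothing.
    """
    total = sum(probs)
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    entropy = 0.0
    for p in probs:
        q = p / total
        if q > 0:
            entropy -= q * math.log(q, base)
    return entropy


def flag_uncertain(predictions, threshold=0.9):
    """Return the names whose predictive entropy is at or above the threshold.

    `predictions` maps a solvent name to its class probabilities, e.g.
    [P(ED-active), P(ED-inactive)]. The threshold is an assumed value.
    """
    return [name for name, probs in predictions.items()
            if shannon_entropy(probs) >= threshold]


# Hypothetical model outputs, for illustration only.
predictions = {
    "solvent_A": [0.5, 0.5],    # maximally uncertain for two classes
    "solvent_B": [0.95, 0.05],  # confident prediction
}
needs_validation = flag_uncertain(predictions)
```

For a two-class prediction the entropy ranges from 0 bits (fully confident) to 1 bit (even split), so a threshold near 1 singles out only the predictions the model is least sure about.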
Indexed keywords
MeSH: Calcium Compounds; Oxides; Solvents; Titanium; Uncertainty
Engineering controlled terms: Data accuracy; Data consistency; Data curation; Data reliability; Deep learning; Extraction; Forecasting; Knowledge management; Learning systems; Perovskite; Solvents; Uncertainty analysis
EMTREE drug terms: calcium derivative; oxide; perovskite; solvent; titanium
Engineering uncontrolled terms: American Chemical Society; Automated data; Contextual modeling; Data curation; Data quality; Data set; Excel; Language model; Normalisation; Uncertainty
EMTREE medical terms: chemistry; synthesis; uncertainty
Engineering main heading: Economic and social effects
Reaxys Chemistry database information
Reaxys is designed to support chemistry researchers at every stage with the ability to investigate chemistry-related research topics in peer-reviewed literature, patents, and substance databases. Reaxys retrieves substances, substance properties, and reaction and synthesis data.
Substances: 4-butanolide
Chemicals and CAS Registry Numbers
Unique identifiers assigned by the Chemical Abstracts Service (CAS) to ensure accurate identification and tracking of chemicals across scientific literature.
oxide: 16833-27-5
perovskite: 12194-71-7, 61027-03-0
titanium: 7440-32-6
Calcium Compounds
Funding details
Details about financial support for research, including funding sources and grant numbers as provided in academic publications.
Funding sponsor; Funding number; Acronym
University at Buffalo; (none listed); UB
CoRE center; (none listed); (none listed)
Col-laboratory for a Regenerative Economy; (none listed); (none listed)
National Science Foundation; 2315307; NSF
National Science Foundation; (none listed); NSF
Funding text
The authors acknowledge support from NSF Award No. 2315307: NSF Engines Development Award and the Col-laboratory for a Regenerative Economy (CoRE center) in the Department of Materials Design and Innovation - University at Buffalo.
Corresponding authors
Corresponding author: K. Rajan
Affiliation: Department of Materials Design and Innovation, University at Buffalo, Buffalo, 14260−1660, NY, United States
Email address: krajan3@buffalo.edu
© Copyright 2025 Elsevier B.V., All rights reserved.
1
storage/FUHPB4WI/.zotero-reader-state
Normal file
@@ -0,0 +1 @@
{"scale":1,"scrollYPercent":0}
|
||||
8
storage/FUHPB4WI/105013389245.html
Normal file
File diff suppressed because one or more lines are too long