Result: Schematic of the approach: This schematic illustrates the entire workflow of the project.
Title:
Schematic of the approach: This schematic illustrates the entire workflow of the project.
Authors:
Publication Year:
2025
Subject Terms:
Cancer, Science Policy, Space Science, Environmental Sciences not elsewhere classified, Biological Sciences not elsewhere classified, Mathematical Sciences not elsewhere classified, Information Systems not elsewhere classified, semantically meaningful representations, received less attention, python code implementing, ensuring consistent representation, combined approach aimed, cell lung carcinoma, additionally leveraged wordnet, 8 %, suggesting, reduce embedding noise, improving embedding quality, higher embedding quality, biomedical synonym replacement, biomedical concept representations, embeddings, word2vec algorithm applied, span multiple words, mean pairwise distance, single concept identifier, biomedical concept synonyms, embedding techniques, biomedical terms, biomedical synonyms
Document Type:
Image
still image
Language:
unknown
DOI:
10.1371/journal.pone.0322498.g001
Availability:
Rights:
CC BY 4.0
Accession Number:
edsbas.20683F47
Database:
BASE
Further Information
The workflow begins with text preprocessing using the marea software, which yields the PM corpus [ 6 ]. Non-biomedical concept replacement is then applied to the PM corpus, producing the WN corpus. To assess the concept replacement proposal fairly, both corpora are embedded with the same text-embedding algorithm (Word2Vec in our experiments, chosen for its broad usage and relative simplicity). Pairwise distances between sets of related biomedical concepts in the embedded PM corpus are then compared with those in the embedded WN corpus.
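
The final comparison step can be illustrated with a minimal sketch. It is not the authors' code: it assumes cosine distance as the pairwise metric (the record only says "pairwise distances"), skips Word2Vec training entirely, and uses hypothetical toy vectors and placeholder concept names (`concept_a`, `concept_b`, `concept_c`) in place of real embeddings of the PM and WN corpora.

```python
import math
from itertools import combinations


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def mean_pairwise_distance(embedding, concepts):
    """Average cosine distance over all unordered pairs of the given concepts.

    Concepts missing from the embedding are skipped; returns None when
    fewer than two concepts remain.
    """
    vectors = [embedding[c] for c in concepts if c in embedding]
    pairs = list(combinations(vectors, 2))
    if not pairs:
        return None
    return sum(cosine_distance(u, v) for u, v in pairs) / len(pairs)


# Hypothetical toy vectors standing in for Word2Vec output on each corpus.
pm_embedding = {
    "concept_a": [1.0, 0.0],
    "concept_b": [0.0, 1.0],
    "concept_c": [1.0, 1.0],
}
wn_embedding = {
    "concept_a": [1.0, 0.0],
    "concept_b": [1.0, 1.0],
    "concept_c": [1.0, 1.0],
}
# Assumed set of related biomedical concepts (placeholder identifiers).
related_concepts = ["concept_a", "concept_b", "concept_c"]

pm_score = mean_pairwise_distance(pm_embedding, related_concepts)
wn_score = mean_pairwise_distance(wn_embedding, related_concepts)
print(f"PM: {pm_score:.3f}  WN: {wn_score:.3f}")
```

Under these assumptions, a lower mean pairwise distance for a set of related concepts in one embedding would indicate that the concepts sit closer together there. With real data, the two dictionaries would be filled from the vectors learned on each corpus.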