About the webserver

To use the AOP-helpFinder service, please enter your email address. Your email address will only be used for the service and will be erased as soon as you have received your results. If you want to run a second search, you will have to re-submit your email address.

After a couple of minutes, you will receive an email. Please click the link in it to access the AOP-helpFinder web service form.

  • Upload your files
  • You should upload two separate files, one with the stressors and one with the events.
    Both files should be in .txt or .TXT format, with a maximum size of 10 MB (please contact us if you have bigger files), and contain no special characters. In each file, please put one item per line (one event per line, or one stressor per line), even for synonyms, as they will be searched independently.
    You can also load an example stressor file and event file.
  • Output format option
  • Please choose the format for your result files. Two file formats are offered: .tsv and .txt.
    Note that if you choose .txt, you will also get the full abstracts. The available options are: .tsv; .tsv with abstracts; .txt. Please read the section about the method for more details.
  • Refinement filter option
  • The tool can refine the searches by combining the removal of sentences containing context words with a lemmatization process. Lemmatization is a machine learning method for text normalization used in NLP that considers the context and converts a word to its meaningful base form. This option is very useful when terms from the event list share a common stem (e.g. tests, testis -> test), in which case the stemming process may lead to incorrect meanings and spelling errors.
    For more details of the refinement option, please read the method section below and the supplementary material.
  • Reduced search option
  • The tool can search the full abstracts, or skip their beginning, which usually corresponds to the first 20% of an abstract. This option helps to avoid too many false positives, as the introductory part often reflects a working hypothesis.
    For more details of the reduced search option, please read the method section below and the supplementary material.

    AOP-helpFinder is provided without any warranty. If you have any problems, please feel free to contact us by email.
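
The input file requirements listed above can be checked locally before uploading. The following is a minimal sketch, not part of the web service: the function name `read_terms` is hypothetical, and "no special characters" is interpreted here as "printable ASCII only", which is an assumption.

```python
from pathlib import Path

MAX_BYTES = 10 * 1024 * 1024  # the 10 MB limit stated above


def read_terms(path):
    """Return the non-empty lines of a stressor or event file, one term per line."""
    p = Path(path)
    if p.suffix.lower() != ".txt":
        raise ValueError(f"{p.name}: expected a .txt or .TXT file")
    if p.stat().st_size > MAX_BYTES:
        raise ValueError(f"{p.name}: larger than 10 MB")
    terms = []
    lines = p.read_text(encoding="utf-8").splitlines()
    for number, line in enumerate(lines, start=1):
        term = line.strip()
        if not term:
            continue  # ignore blank lines
        # Assumption: "special characters" means anything outside printable ASCII.
        if not (term.isascii() and term.isprintable()):
            raise ValueError(f"{p.name}, line {number}: special character in {term!r}")
        terms.append(term)
    return terms
```

Each synonym stays on its own line, so a file containing "bisphenol A" and "BPA" on two lines yields two independent search terms.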

After some time (it can take some hours, depending on your input data and the queue), you will receive an email with a link. Follow it to access your results.
You can download your result files as a .zip archive. Please note that it will be automatically deleted after one month.

About the method

The two input files are in text format (.txt). The first contains the list of stressors (one per line; if there are synonyms, one line per synonym) (stressor example), and the second contains the biological events of interest (events example).

The NCBI API is used to retrieve abstracts from the PubMed database (Kans J.). To facilitate the run and optimize the time, the full PubMed database has been downloaded locally and is updated before each use of the AOP-helpFinder web tool. For each stressor of interest, a file in .xml format is created that contains all the abstracts mentioning this stressor. Then, an adaptation of the titipata PubMed Parser (Achakulvisut et al., 2020) is used to extract the information of interest.
Kans J. Entrez Direct: E-utilities on the Unix Command Line. 2013 Apr 23 [Updated 2021 Apr 29]. In: Entrez Programming Utilities Help [Internet]. Bethesda (MD): National Center for Biotechnology Information (US); 2010-. Available from: https://www.ncbi.nlm.nih.gov/books/NBK179288/
Achakulvisut et al. (2020). Pubmed Parser: A Python Parser for PubMed Open-Access XML Subset and MEDLINE XML Dataset XML Dataset. Journal of Open Source Software, 5(46), 1979, https://doi.org/10.21105/joss.01979
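
To illustrate the extraction step, here is a minimal sketch that reads PMID, title and abstract from a PubMed-style .xml file. It uses only the Python standard library and is not the titipata PubMed Parser or the tool's adaptation of it. The sample record and the function name `extract_abstracts` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical record following the usual PubMed XML element names.
SAMPLE_XML = """<PubmedArticleSet>
  <PubmedArticle>
    <MedlineCitation>
      <PMID>1</PMID>
      <Article>
        <ArticleTitle>Hypothetical title</ArticleTitle>
        <Abstract>
          <AbstractText Label="BACKGROUND">First part.</AbstractText>
          <AbstractText Label="RESULTS">Second part.</AbstractText>
        </Abstract>
      </Article>
    </MedlineCitation>
  </PubmedArticle>
</PubmedArticleSet>"""


def extract_abstracts(xml_text):
    """Return one dict per article with its PMID, title and joined abstract text."""
    root = ET.fromstring(xml_text)
    records = []
    for article in root.iter("PubmedArticle"):
        pmid = article.findtext("MedlineCitation/PMID")
        title = article.findtext("MedlineCitation/Article/ArticleTitle")
        parts = article.findall("MedlineCitation/Article/Abstract/AbstractText")
        # Structured abstracts have several AbstractText parts; join them in order.
        abstract = " ".join("".join(p.itertext()).strip() for p in parts)
        records.append({"pmid": pmid, "title": title, "abstract": abstract})
    return records


records = extract_abstracts(SAMPLE_XML)
```

A dedicated parser handles many more fields and edge cases; this sketch only shows the kind of information that is pulled out of each record.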

  • Pre-processing
  • A multi-step pre-processing procedure prepares all the abstracts extracted from the PubMed database. First, the abstracts are cleaned: they are split into sentences, then the negative sentences and the stop words (a, and, the ...) are removed. Finally, stemming is performed on each word.
  • Score method using the position of the event
  • A score is calculated for each pre-processed abstract using the position of the identified event in the text. This was developed because a stressor and an event co-mentioned at the beginning of an abstract usually refer to a working hypothesis, whereas when they are co-mentioned at the end they are more likely to relate to results and findings ("reduced search" option).
  • Score method using graph theory
  • For each pre-processed abstract, the algorithm looks for the events sequentially (using the user's list of events). If the tool finds at least 3/4 of the words from one event, it computes a distance score based on Dijkstra's algorithm. In this version, the grammatical context is taken into account through lemmatization and context words (see the ‘refinement filter’ option).
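
The pre-processing steps and the 3/4 matching rule described above can be sketched as follows. This is an illustration under stated assumptions, not the tool's code: the stop word list, the negation markers, the suffix-stripping `crude_stem` function and the name `event_found` are all hypothetical stand-ins.

```python
import re

STOP_WORDS = {"a", "an", "and", "the", "of", "in", "to", "is", "was", "were", "by"}  # tiny illustrative list
NEGATIONS = {"no", "not", "never", "without"}  # assumed negation markers


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def crude_stem(word):
    """Very rough suffix stripping, standing in for a real stemmer."""
    for suffix in ("ing", "es", "is", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(abstract):
    """Split into sentences, drop negative sentences and stop words, stem the rest."""
    # Naive splitting on end punctuation; abbreviations such as "e.g." would break it.
    sentences = re.split(r"(?<=[.!?])\s+", abstract.strip())
    kept = []
    for sentence in sentences:
        words = tokenize(sentence)
        if NEGATIONS & set(words):
            continue  # negative sentence
        stems = [crude_stem(w) for w in words if w not in STOP_WORDS]
        if stems:
            kept.append(stems)
    return kept


def event_found(event, stems, threshold=0.75):
    """True if at least 3/4 of the event's stemmed words occur in the sentence stems."""
    event_stems = [crude_stem(w) for w in tokenize(event) if w not in STOP_WORDS]
    if not event_stems:
        return False
    hits = sum(1 for s in event_stems if s in stems)
    return hits / len(event_stems) >= threshold


sentences = preprocess(
    "Bisphenol A increased testis weight in rats. It did not alter liver weight."
)
```

Here the second sentence is discarded because it contains a negation, and "increased testis weight gain" still matches the first sentence because three of its four words are found. Note that this crude stemmer maps both "tests" and "testis" to "test", which is the kind of collision the refinement option is meant to address.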
For more details, please read this link and github
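
For readers unfamiliar with Dijkstra's algorithm, the sketch below computes shortest-path distances in a small weighted graph. How AOP-helpFinder builds its graph and turns distances into a score is not described in this section, so the toy graph (consecutive stems of one sentence linked with weight 1) is purely an assumption made for illustration.

```python
import heapq


def dijkstra(graph, source):
    """Shortest distances from source in a graph given as {node: {neighbour: weight}}."""
    distances = {source: 0}
    queue = [(0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if dist > distances.get(node, float("inf")):
            continue  # stale queue entry, a shorter path was already found
        for neighbour, weight in graph.get(node, {}).items():
            candidate = dist + weight
            if candidate < distances.get(neighbour, float("inf")):
                distances[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    return distances


# Assumed toy graph: each stem is linked to the next one in the sentence.
stems = ["bisphenol", "increas", "test", "weight", "rat"]
graph = {}
for left, right in zip(stems, stems[1:]):
    graph.setdefault(left, {})[right] = 1
    graph.setdefault(right, {})[left] = 1

distances = dijkstra(graph, "bisphenol")
```

In this toy graph the distance from the stressor stem to an event stem is simply the number of words between them; a real graph with different edges or weights would give different values.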

Due to a large number of false positives at the beginning of the abstracts, AOP-helpFinder provides a reduced search option to skip the beginning of the abstracts. We advise the user to apply a limit ignoring the first 20% of the abstracts, which mostly corresponds to the introductory part of the study. Indeed, ignoring the first 20% of the abstracts significantly improves the results, increasing the precision by about 10% and reducing the noise by 15%, while keeping more than 95% of the information.
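
The idea of the reduced search can be sketched in a few lines. This is an illustration only: the function name `reduced_abstract` is hypothetical, and measuring the first 20% in words (rather than in characters or sentences) is an assumption.

```python
def reduced_abstract(abstract, skip_percent=20):
    """Return the abstract without its first skip_percent percent of words."""
    words = abstract.split()
    start = len(words) * skip_percent // 100  # number of leading words to ignore
    return " ".join(words[start:])


example = "w0 w1 w2 w3 w4 w5 w6 w7 w8 w9"
shortened = reduced_abstract(example)
```

With a ten-word text, the first two words are ignored and the search would run on the remaining eight. Setting `skip_percent=0` corresponds to searching the full abstract.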

The AOP-helpFinder web tool will automatically screen the abstracts mentioning the stressor of interest to retrieve known associations with the selected events. Therefore, depending on the input data (especially when searching for a broad event type such as 'metabolism', or for events with a common stem such as 'test' and 'testis'), a huge number of results may be retrieved, or stemming may lead to incorrect meanings and spelling errors.
A new module called "refinement option" has been developed to facilitate further analysis of the results, for example by manual curation. This refinement option is only applied to the abstracts selected by the pre-processing run. As in the pre-processing step, the refinement option splits the abstracts into sentences and removes the negative sentences and the stop words. But instead of stemming the words, the option uses lemmatization, which considers the context and converts each word to its meaningful base form. Then, the AOP-helpFinder tool removes the sentences containing a "context word" (e.g. suggest, ...). Finally, the scores are computed as described in "3. Pre-processing and scoring". This step is offered as an option because it is more time consuming.
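
The two steps specific to the refinement option, lemmatization and removal of sentences with context words, can be sketched as follows. This is a simplified illustration: the `LEMMAS` lookup table is a hypothetical stand-in for a real context-aware lemmatizer, only the context word "suggest" comes from the text above, and the negative-sentence and stop-word removal steps are left out for brevity.

```python
import re

CONTEXT_WORDS = {"suggest"}  # the only context word named in the text above

# Hypothetical lookup table standing in for a real lemmatizer.
LEMMAS = {
    "tests": "test",
    "testes": "testis",
    "suggests": "suggest",
    "suggested": "suggest",
}


def lemmatize(word):
    """Map a word to its base form, leaving unknown words unchanged."""
    return LEMMAS.get(word, word)


def refine(sentences):
    """Lemmatize each sentence and drop those containing a context word."""
    kept = []
    for sentence in sentences:
        lemmas = [lemmatize(w) for w in re.findall(r"[a-z0-9]+", sentence.lower())]
        if CONTEXT_WORDS & set(lemmas):
            continue  # speculative sentence, removed by the refinement
        kept.append(lemmas)
    return kept


refined = refine([
    "These results suggest an effect on the testis.",
    "Exposure reduced testes weight.",
    "Two tests were run.",
])
```

Unlike stemming, this keeps "testis" and "test" apart, so an event about the testis is no longer matched by a sentence that only mentions tests.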