Project by
Chinmay,Simrat,Kumkum
on
on
Week 6: Working on UDpipe
Activities performed in this week
- Download all the drug HTML files from the drugbank site.
- Write code to extract all the descriptions from the HTML files and save it in a file.
- Pass the drug list to the UDPIPE to get lematized drug names.
- From the output of UDPIPE generate the list of lemmatized drug names. EG:- paracitamols -> paracitamol.
- Generate 1-hot encoding of the drug drug pair data and pass it to SVM and train it.
- Pass the 1-hoit encoding of drug names and the surrounding words and pass it to SVM.
- Modify Java Script to extract all the sentences from training and test XML file into a .txt file