Project by
Chinmay, Simrat, Kumkum
Week 5: Dataset Building and ML
Activities performed this week:
- Evaluated whether GENIA would work for identifying drug names (an NER task) by installing it and passing some sample sentences to it.
- Installed and set up UDPipe.
- Wrote code to modify the UDPipe output so that drug names are annotated, and used it to generate a training file for UDPipe.
- Implemented bag of words on the train and test files, which contain pairs of drugs and their interactions labelled ‘true’ or ‘false’, using the gensim library in Python.
- Used the bag-of-words vectors from the previous step to train and test an SVM (linear kernel) model. Test accuracy was around 99%, but this figure is inflated because the same drug pairs are present in both the test and train sets.
- Started implementing CNNs with Word2Vec.
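
The bag-of-words step and the train/test overlap behind the 99% figure can be illustrated with a minimal sketch. It uses only the Python standard library rather than gensim, and the drug pairs, the function names (build_vocab, bow_vector, pair_key, overlap_fraction) and the choice to treat a pair as order-independent are all assumptions made for illustration, not the project's actual data or code.

```python
from collections import Counter


def build_vocab(docs):
    """Map each distinct token to a column index, in first-seen order."""
    vocab = {}
    for tokens in docs:
        for tok in tokens:
            vocab.setdefault(tok, len(vocab))
    return vocab


def bow_vector(tokens, vocab):
    """Dense count vector; tokens missing from the vocabulary are ignored."""
    counts = Counter(tokens)
    return [counts.get(tok, 0) for tok in vocab]


def pair_key(drug_a, drug_b):
    """Assumption: (a, b) and (b, a) are the same pair, compared case-insensitively."""
    return frozenset((drug_a.lower(), drug_b.lower()))


def overlap_fraction(train_pairs, test_pairs):
    """Fraction of test pairs that also occur in the training set."""
    train_keys = {pair_key(a, b) for a, b in train_pairs}
    shared = [p for p in test_pairs if pair_key(*p) in train_keys]
    return len(shared) / len(test_pairs) if test_pairs else 0.0


# Hypothetical toy data, not taken from the project's files.
train_pairs = [("aspirin", "warfarin"), ("ibuprofen", "lisinopril")]
test_pairs = [("warfarin", "aspirin"), ("metformin", "insulin")]

vocab = build_vocab([list(p) for p in train_pairs])
vec = bow_vector(["warfarin", "aspirin"], vocab)
frac = overlap_fraction(train_pairs, test_pairs)

print("vector:", vec)
print("share of test pairs also seen in training:", frac)
```

A high overlap fraction suggests that test accuracy mostly measures memorisation of pairs already seen in training. Splitting the data so that no pair appears in both sets would likely give a more realistic estimate.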