Super vectorizer safe

To train my models, I used textual data from the Kaggle Perspective API Challenge, which contained strings of text along with binary labels for each of the six bullying categories. I also tried different ways of turning text into vectors: the TF-IDF vectorizer, the hashing vectorizer, the count vectorizer, and custom feature engineering, all in Python using scikit-learn. Within SVMs, I tried the linear, RBF, sigmoid, and polynomial kernels.
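A comparison of vectorizers and SVM kernels like the one described above could be set up roughly as follows. This is only a sketch, not the code used in the project: the toy sentences, the single binary label, and the `n_features` value are assumptions made for illustration, and the custom feature engineering is left out.

```python
from sklearn.feature_extraction.text import (
    CountVectorizer,
    HashingVectorizer,
    TfidfVectorizer,
)
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Hypothetical toy data standing in for the Kaggle comments (1 = toxic).
texts = [
    "you are an idiot",
    "what a stupid idea, idiot",
    "thanks for the helpful answer",
    "have a wonderful day",
    "nobody likes you, stupid",
    "great point, thanks for sharing",
]
labels = [1, 1, 0, 0, 1, 0]

# Factories so that every run gets a fresh, unfitted vectorizer.
vectorizers = {
    "tfidf": TfidfVectorizer,
    "hashing": lambda: HashingVectorizer(n_features=2**10),
    "count": CountVectorizer,
}
kernels = ["linear", "rbf", "sigmoid", "poly"]

results = {}
for vec_name, make_vec in vectorizers.items():
    for kernel in kernels:
        model = make_pipeline(make_vec(), SVC(kernel=kernel))
        model.fit(texts, labels)
        # Training accuracy only; a real comparison needs a held-out split.
        results[(vec_name, kernel)] = model.score(texts, labels)

for combo, acc in sorted(results.items()):
    print(combo, round(acc, 2))
```

On a real dataset, the scores would come from a validation set or cross-validation rather than from the training data.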

In my experiment, I tried the support vector machine (SVM) and Naive Bayes algorithms. My model followed this general pipeline: load a piece of text, turn that text into a numerical vector using a vectorizer, run the vector through a classifier, and determine whether the text is toxic, severely toxic, identity attacking, insulting, threatening, obscene, or any combination thereof.

Trolls, bots, and others who post toxic speech on platforms such as Facebook and Twitter not only undermine a culture of constructive dialogue but also cause harm on a personal level, in severe cases triggering depression and self-harm.
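The classification pipeline described above (load text, vectorize, classify, report any combination of the six categories) might look something like this in scikit-learn. It is a minimal sketch under assumptions: the label names, the toy training rows, and the `classify` helper are hypothetical, and one Naive Bayes classifier per category is just one way to allow several categories at once.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.multiclass import OneVsRestClassifier
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical column names for the six categories.
LABELS = ["toxic", "severe_toxic", "identity_attack",
          "insult", "threat", "obscene"]

# Hypothetical toy rows; each text has one 0/1 flag per category.
texts = [
    "you are a worthless idiot",
    "i will find you and hurt you",
    "people like you should not be allowed here",
    "shut up you damn fool",
    "you damn worthless idiot i will hurt you",
    "thanks for the thoughtful reply",
    "i enjoyed reading this article",
    "see you at the meeting tomorrow",
]
y = np.array([
    [1, 0, 0, 1, 0, 0],
    [1, 0, 0, 0, 1, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 0, 0, 1, 0, 1],
    [1, 1, 0, 1, 1, 1],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
])

# Vectorizer followed by one binary Naive Bayes classifier per category.
model = make_pipeline(TfidfVectorizer(), OneVsRestClassifier(MultinomialNB()))
model.fit(texts, y)


def classify(text):
    """Return the names of the categories the model flags for `text`."""
    row = model.predict([text])[0]
    return [name for name, hit in zip(LABELS, row) if hit]


print(classify("you worthless fool"))
```

Swapping `MultinomialNB()` for an SVM such as `sklearn.svm.LinearSVC()` keeps the rest of the pipeline unchanged.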

Using the working definition of cyberbullying as “willful and repeated harm inflicted through the use of computers, cell phones, and other electronic devices,” I experimented with an array of machine learning models for the purpose of creating a program that could detect and address cyberbullying in real time.

As someone who avidly uses social media for personal, academic, and professional communications as well as for informational purposes, I believe that having welcoming, safe, and constructive dialogue platforms is paramount to having fruitful conversations.

The technical and non-technical guidance I received from my mentor, from AI4ALL, and from the larger network of other mentors and mentees was invaluable. I also appreciated the experiences of working through a challenging and multifaceted problem while applying technical concepts to tackle a pervasive social issue.

On the research front, I learned how to glean and synthesize information from technical papers, and how to run experiments, compare results, and prioritize evaluation metrics in context. On the technical front, I learned how to implement support vector machines and Naive Bayes algorithms, how to refine default machine learning models by tuning parameters, how to use Python to turn text into vectors, and how to calculate evaluation metrics such as accuracy, precision (macro and micro), and runtime.
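To make the difference between macro and micro precision concrete, here is a small worked example in plain Python. The two labels and their predictions are made up for illustration; macro precision averages the per-label precisions, while micro precision pools true and false positives across labels before dividing.

```python
import time


def precision_counts(true, pred):
    """Count true positives and false positives for one label."""
    tp = sum(1 for t, p in zip(true, pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(true, pred) if t == 0 and p == 1)
    return tp, fp


def macro_precision(y_true, y_pred):
    """Average of the per-label precisions (each label counts equally)."""
    per_label = []
    for label in y_true:
        tp, fp = precision_counts(y_true[label], y_pred[label])
        per_label.append(tp / (tp + fp) if tp + fp else 0.0)
    return sum(per_label) / len(per_label)


def micro_precision(y_true, y_pred):
    """Precision over pooled counts (each prediction counts equally)."""
    tp_total = fp_total = 0
    for label in y_true:
        tp, fp = precision_counts(y_true[label], y_pred[label])
        tp_total += tp
        fp_total += fp
    denom = tp_total + fp_total
    return tp_total / denom if denom else 0.0


def accuracy(y_true, y_pred):
    """Fraction of individual label decisions that are correct."""
    pairs = [(t, p) for label in y_true
             for t, p in zip(y_true[label], y_pred[label])]
    return sum(1 for t, p in pairs if t == p) / len(pairs)


# Made-up ground truth and predictions for four texts and two labels.
y_true = {"toxic": [1, 1, 0, 0], "insult": [1, 0, 0, 0]}
y_pred = {"toxic": [1, 1, 1, 0], "insult": [1, 0, 0, 0]}

start = time.perf_counter()
macro = macro_precision(y_true, y_pred)   # (2/3 + 1/1) / 2
micro = micro_precision(y_true, y_pred)   # (2 + 1) / (3 + 1)
acc = accuracy(y_true, y_pred)            # 7 of 8 decisions correct
elapsed = time.perf_counter() - start     # runtime in seconds

print(round(macro, 3), micro, acc)  # 0.833 0.75 0.875
```

In practice, scikit-learn's `precision_score` with `average="macro"` or `average="micro"` computes the same quantities.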

Over the twelve weeks of the AI4ALL mentorship program, I worked with IBM’s CTO of Applied AI to create Humanly, a natural language processing system designed to detect cyberbullying in text.

Above all, the program was a wonderful learning experience.