The Code Free Data Science class is designed for learners seeking to gain or expand their knowledge in the area of Data Science. Participants will receive basic training in effective predictive analytics approaches that accompany the growing discipline of Data Science, without any programming requirements. Machine learning methods will be presented using the KNIME Analytics Platform to discover patterns and relationships in data. Predicting future trends and behaviors allows for proactive, data-driven decisions. During the class, learners will acquire new skills to apply predictive algorithms to real data and to evaluate, validate, and interpret the results, with no programming prerequisites. Participants will gain the essential skills to design, build, verify, and test predictive models.

Description: A telecom company wants you to predict which customers are going to churn (that is, are going to cancel their contracts) based on attributes of their accounts. To this end, you are expected to use a decision tree classifier. The company gives you two datasets (training and test), both with many attributes and the class ‘Churn’ to be predicted (value 0 corresponds to customers that do not churn, and 1 corresponds to those who do). You should train the decision tree classifier with the training data, and assess its quality over the test data (calculate the accuracy, precision, recall, and confusion matrix, for example).

Note 1: This challenge is a simple introduction to predictive problems, focusing on classification. You are expected to just apply a decision tree classifier (and get an accuracy of about 92%). A simple solution should consist of 5 nodes.

Note 2: In this challenge, do not change the statistical distribution of any attribute or class in the datasets, and use all available attributes.

Note 3: Need more help to understand the problem? Check this blog post out.

Solution Summary: Using the learner-predictor paradigm, we trained a decision tree classifier over the training data and assessed its performance over the test data.

Solution Details: After reading the training and test datasets with two instances of the CSV Reader node, we used the Decision Tree Learner node to train a decision tree classifier, and the Decision Tree Predictor node to apply it over the test data in order to assess its performance. When training the decision tree, we used the Gini index as the quality measure, pruned the tree using the MDL method, and kept at least 6 records per node. By doing this, we achieved an accuracy of about 94%.

Challenge 24: Modeling Churn Predictions - Part 2

Description: Just like in last week’s challenge, a telecom company wants you to predict which customers are going to churn (that is, going to cancel their contracts) based on attributes of their accounts. One of your colleagues said that she was able to achieve a bit over 95% accuracy for the test data without modifying the training data at all, and using all given attributes exactly as they are. Again, the target class to be predicted is Churn (value 0 corresponds to customers that do not churn, and 1 corresponds to those who do). What model should you train over the training dataset to obtain this accuracy over the test dataset? Can this decision be automated?

Note 1: A simple, automated solution to this challenge consists of 5 nodes.

Note 3: Need more help to understand the problem? Check this blog post out.

Dataset: Training and Test Data in the KNIME Hub

Remember to upload your solution with tag justknimeit-24 to your public space on the KNIME Hub. To increase the visibility of your solution, also post it to this challenge thread on the KNIME forum.

Our solution will appear here next Tuesday. In the meantime, feel free to discuss your work on the KNIME forum or on social media using the hashtag #justknimeit.
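Both churn challenges judge a model by how well its predictions match the true Churn values in the test data. The class itself stays code free, but for readers curious about the arithmetic behind accuracy, precision, recall, and the confusion matrix, here is a minimal Python sketch. The function names (`confusion_counts`, `score`) and the ten toy labels are illustrative assumptions. They are not taken from the challenge datasets or from any KNIME node.

```python
from collections import Counter


def confusion_counts(actual, predicted, positive=1):
    """Count true/false positives and negatives for a binary target."""
    counts = Counter()
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["tp" if a == positive else "fp"] += 1
        else:
            counts["fn" if a == positive else "tn"] += 1
    return counts


def score(actual, predicted, positive=1):
    """Return accuracy, precision, recall, and a 2x2 confusion matrix.

    The matrix rows are the actual classes (negative, positive) and the
    columns are the predicted classes (negative, positive).
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    c = confusion_counts(actual, predicted, positive)
    tp, fp, fn, tn = c["tp"], c["fp"], c["fn"], c["tn"]
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        "confusion_matrix": [[tn, fp], [fn, tp]],
    }


if __name__ == "__main__":
    # Toy labels for illustration only (1 = churn, 0 = no churn).
    actual = [0, 0, 0, 0, 1, 1, 1, 0, 1, 0]
    predicted = [0, 0, 1, 0, 1, 0, 1, 0, 1, 0]
    for name, value in score(actual, predicted).items():
        print(name, value)
```

Precision and recall are computed here for the churn class (value 1). When churners are a minority of customers, these two numbers can look quite different from the overall accuracy, which is one reason the first challenge asks for more than accuracy alone.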