Course Finder


An attempt to guide undergraduate students in choosing the right course using machine learning.

Introduction: Students who are new to college often find it difficult to choose the course best suited to them. Data is collected from students currently studying for undergraduate degrees in various departments. The data mainly comprises their interests before joining college: their subjects of interest, and also activities such as solving puzzles for fun or playing chess. This forms the input. Details of what they like in their department, and how well they find it suited to their liking, are also collected. This forms the target. Mathematical models are created to map the input to the target. Input data is then collected from new students, and the model predicts the probability of each student liking a course.

Features: 10 students who like their department are chosen. They are given a set of 25 questions (questions that indicate a probable interest in that department). Sample: someone who likes puzzles, chess, maths and solving sudoku is likely to have an interest in computer science. The students rate their interest level on a scale of 1 to 5. This forms the input feature data (a 25 × 1 vector). The target is obtained from the students: it is the department they are studying in.
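A minimal sketch of how one survey response could be stored as a (feature vector, target) pair. The function name make_feature_vector, the sample ratings and the department label are hypothetical and only illustrate the 25-question, 1-to-5 layout described above.

```python
NUM_QUESTIONS = 25               # number of survey questions, as described above
RATING_MIN, RATING_MAX = 1, 5    # interest scale, as described above


def make_feature_vector(ratings):
    """Check one student's answers and return them as a list of 25 floats."""
    if len(ratings) != NUM_QUESTIONS:
        raise ValueError(f"expected {NUM_QUESTIONS} ratings, got {len(ratings)}")
    for r in ratings:
        if not RATING_MIN <= r <= RATING_MAX:
            raise ValueError(f"rating {r} is outside {RATING_MIN}..{RATING_MAX}")
    return [float(r) for r in ratings]


# Made-up answers for one student; the department name is illustrative only.
sample_ratings = [5, 4, 5, 3, 1] * 5
sample = (make_feature_vector(sample_ratings), "Computer Science")
```

Rejecting malformed responses at this stage keeps every training example the same length, which the classifiers rely on.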

Model: A classification algorithm is run on the data and a model is constructed. A softmax classification algorithm is used, which helps freshers estimate the probability of their liking each department. Different classification models are tried: neural networks and a Bayesian model. A fresh set of 20 students is used for testing and validation of the model.
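To illustrate the softmax step, here is a minimal sketch that turns one raw score per department into probabilities that sum to 1. The department names and the scores are made up for illustration; in practice the scores would come from the trained model.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical departments and model scores for a single new student.
departments = ["Computer Science", "Mechanical", "Electrical"]
scores = [2.0, 1.0, 0.1]

probs = softmax(scores)
for dept, p in zip(departments, probs):
    print(f"{dept}: {p:.3f}")
```

A higher score always maps to a higher probability, so the department with the largest score is also the most probable one.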

Neural Networks: Definition: Artificial neural networks (ANNs) are a family of statistical learning models inspired by biological neural networks (the central nervous systems of animals, in particular the brain) and are used to estimate or approximate functions that can depend on a large number of inputs and are generally unknown. The backpropagation algorithm is used to train the network.

Intuition: The input layer holds the input data. One hidden layer is present. Weights are used to compute the values in the hidden layer. The values in the output layer are determined from the hidden layer. The error is measured at the output layer. The weights are then adjusted using the backpropagation algorithm so that the error is reduced.
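The steps above can be sketched as a small network with one hidden layer, trained by backpropagation. This is an illustrative sketch, not the project's actual code: the class name TinyNetwork, the sigmoid hidden units, the cross-entropy error, the learning rate and the toy data (4 features and 2 departments instead of 25 questions) are all assumptions.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def softmax(scores):
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class TinyNetwork:
    """Input layer -> one hidden layer (sigmoid) -> output layer (softmax)."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_out)]
        self.b2 = [0.0] * n_out

    def forward(self, x):
        # Weights give the hidden values; hidden values give the output values.
        hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(self.w1, self.b1)]
        scores = [sum(w * h for w, h in zip(row, hidden)) + b
                  for row, b in zip(self.w2, self.b2)]
        return hidden, softmax(scores)

    def train_step(self, x, label, lr=0.5):
        hidden, probs = self.forward(x)
        # Error at the output layer (gradient of cross-entropy through softmax).
        d_out = [p - (1.0 if k == label else 0.0) for k, p in enumerate(probs)]
        # Error propagated back to the hidden layer through the output weights.
        d_hidden = [sum(d_out[k] * self.w2[k][j] for k in range(len(d_out)))
                    * hidden[j] * (1.0 - hidden[j])
                    for j in range(len(hidden))]
        # Adjust the weights in the direction that reduces the error.
        for k, row in enumerate(self.w2):
            for j in range(len(row)):
                row[j] -= lr * d_out[k] * hidden[j]
            self.b2[k] -= lr * d_out[k]
        for j, row in enumerate(self.w1):
            for i in range(len(row)):
                row[i] -= lr * d_hidden[j] * x[i]
            self.b1[j] -= lr * d_hidden[j]
        return -math.log(probs[label])  # error for this example before the update


# Made-up toy data: 4 ratings scaled to 0..1, and 2 departments (0 and 1).
data = [
    ([1.0, 0.8, 0.2, 0.0], 0),
    ([0.8, 1.0, 0.0, 0.2], 0),
    ([0.2, 0.0, 1.0, 0.8], 1),
    ([0.0, 0.2, 0.8, 1.0], 1),
]

net = TinyNetwork(n_in=4, n_hidden=3, n_out=2)
first_loss = sum(net.train_step(x, y) for x, y in data)
for _ in range(500):
    last_loss = sum(net.train_step(x, y) for x, y in data)
print(f"error in first pass: {first_loss:.3f}, in last pass: {last_loss:.3f}")
```

Calling net.forward(x)[1] on a new student's ratings returns one probability per department.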

Bayesian Model:

Definition: A Bayesian classifier is based on the idea that the role of a (natural) class is to predict the values of features for members of that class. Examples are grouped in classes because they have common values for the features. Such classes are often called natural kinds. In this section, the target feature corresponds to a discrete class, which is not necessarily binary. The idea behind a Bayesian classifier is that, if an agent knows the class, it can predict the values of the other features. If it does not know the class, Bayes’ rule can be used to predict the class given (some of) the feature values. In a Bayesian classifier, the learning agent builds a probabilistic model of the features and uses that model to predict the classification of a new example.

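Intuition: v1, v2, v3, ..., vk are the values of the features. Given that X1=v1, X2=v2, etc., what is the probability that this input belongs to a specific class Y? Bayes' theorem answers this, with the additional assumption that the features are independent of each other given the class.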

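$$
P(Y \mid X_1=v_1,\dots,X_k=v_k)
= \frac{P(X_1=v_1,\dots,X_k=v_k \mid Y)\,P(Y)}{P(X_1=v_1,\dots,X_k=v_k)}
= \frac{P(X_1=v_1 \mid Y)\times\cdots\times P(X_k=v_k \mid Y)\,P(Y)}{\sum_{Y'} P(X_1=v_1 \mid Y')\times\cdots\times P(X_k=v_k \mid Y')\,P(Y')}
$$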
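The formula can be turned into a small classifier by counting how often each rating occurs for each department. The sketch below is illustrative only: the function names, the toy data (3 questions instead of 25) and the add-one smoothing of the counts are assumptions that do not come from the project itself.

```python
from collections import Counter, defaultdict

RATINGS = [1, 2, 3, 4, 5]  # possible values of every feature


def train_naive_bayes(examples):
    """Count classes and per-feature ratings from (ratings, department) pairs."""
    class_counts = Counter(dept for _, dept in examples)
    value_counts = defaultdict(Counter)  # (department, question index) -> rating counts
    for ratings, dept in examples:
        for i, v in enumerate(ratings):
            value_counts[(dept, i)][v] += 1
    return class_counts, value_counts


def predict_proba(ratings, class_counts, value_counts):
    """Return P(Y | X1=v1, ..., Xk=vk) for every department Y."""
    total = sum(class_counts.values())
    joint = {}
    for dept, n in class_counts.items():
        p = n / total  # prior P(Y)
        for i, v in enumerate(ratings):
            # P(Xi=vi | Y), with add-one smoothing (an assumption) so that an
            # unseen rating does not force the whole product to zero.
            p *= (value_counts[(dept, i)][v] + 1) / (n + len(RATINGS))
        joint[dept] = p
    evidence = sum(joint.values())  # the denominator: sum over all classes
    return {dept: p / evidence for dept, p in joint.items()}


# Made-up training data: 3 questions and 2 departments.
examples = [
    ([5, 4, 1], "CS"),
    ([4, 5, 2], "CS"),
    ([1, 2, 5], "Mech"),
    ([2, 1, 4], "Mech"),
]

class_counts, value_counts = train_naive_bayes(examples)
posterior = predict_proba([5, 5, 1], class_counts, value_counts)
for dept, p in posterior.items():
    print(f"{dept}: {p:.3f}")
```

Dividing by the sum over all departments is what makes the results add up to 1, matching the denominator of the formula.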