R: Predictive Analysis

RM 83.00

ISBN:

9781788290852

Categories:

Engineering & IT

File Size

25.71 MB

Format

epub

Language

English

Release Year

2017

Synopsis

Key Features

Load, wrangle, and analyze your data using the world's most powerful statistical programming language
Familiarize yourself with the most common data mining tools of R, such as k-means, hierarchical regression, linear regression, Naive Bayes, decision trees, and text mining
We emphasize important concepts, such as the bias-variance trade-off and over-fitting, which are pervasive in predictive modeling

Book Description

Predictive analytics is a field that uses data to build models that predict a future outcome of interest. It can be applied to a range of business strategies and has been a key player in search advertising and recommendation engines.

The power and domain-specificity of R allow the user to express complex analytics easily, quickly, and succinctly. R offers a free and open source environment that is perfect for both learning and deploying predictive modeling solutions in the real world. This Learning Path will provide you with all the steps you need to master the art of predictive modeling with R.

We start with an introduction to data analysis with R, and then you'll gradually get your feet wet with predictive modeling. You will get to grips with the fundamentals of applied statistics and build on this knowledge to perform sophisticated and powerful analytics. You will be able to solve the difficulties of performing data analysis in practice and find solutions for working with “messy data” and large data, communicating results, and facilitating reproducibility. You will then perform key predictive analytics tasks using R, such as training and testing predictive models for classification and regression tasks and scoring new data sets. By the end of this Learning Path, you will have explored and tested the most popular modeling techniques in use on real-world data sets and mastered a diverse range of techniques in predictive analytics.

This Learning Path combines some of the best that Packt has to offer in one complete, curated package.
It includes content from the following Packt products:

Data Analysis with R, Tony Fischetti
Learning Predictive Analytics with R, Eric Mayor
Mastering Predictive Analytics with R, Rui Miguel Forte

What you will learn

Get to know the basics of R's syntax and major data structures
Write functions, load data, and install packages
Use different data sources in R and know how to interface with databases, and request and load JSON and XML
Identify the challenges and apply your knowledge about data analysis in R to imperfect real-world data
Predict the future with reasonably simple algorithms
Understand key data visualization and predictive analytic skills using R
Understand the language of models and the predictive modeling process

About the Author

Tony Fischetti is a data scientist at College Factual, where he gets to use R every day to build personalized rankings and recommender systems. He graduated in cognitive science from Rensselaer Polytechnic Institute, and his thesis was strongly focused on using statistics to study visual short-term memory. Tony enjoys writing and contributing to open source software, blogging at http://www.onthelambda.com, writing about himself in third person, and sharing his knowledge using simple, approachable language and engaging examples. The more traditionally exciting of his daily activities include listening to records, playing the guitar and bass (poorly), weight training, and helping others.

Eric Mayor is a senior researcher and lecturer at the University of Neuchatel, Switzerland. He is an enthusiastic user of open source and proprietary predictive analytics software packages, such as R, Rapidminer, and Weka. He analyzes data on a daily basis and is keen to share his knowledge in a simple way.

Rui Miguel Forte is currently the chief data scientist at Workable. He was born and raised in Greece and studied in the UK. He is an experienced data scientist who has over 10 years of work experience in a diverse array of industries spanning mobile marketing, health informatics, education technology, and human resources technology. His projects include the predictive modeling of user behavior in mobile marketing promotions, speaker intent identification in an intelligent tutor, information extraction techniques for job applicant resumes, and fraud detection for job scams. Currently, he teaches R, MongoDB, and other data science technologies to graduate students in the business analytics MSc program at the Athens University of Economics and Business. In addition, he has lectured at a number of seminars, specialization programs, and R schools for working data science professionals in Athens. His core programming knowledge is in R and Java, and he has extensive experience working with a variety of database technologies, such as Oracle, PostgreSQL, MongoDB, and HBase. He holds a master's degree in electrical and electronic engineering from Imperial College London and is currently researching machine learning applications in information extraction and natural language processing.

Table of Contents

RefresheR
The Shape of Data
Describing Relationships
Probability
Using Data to Reason About the World
Testing Hypotheses
Bayesian Methods
Predicting Continuous Variables
Predicting Categorical Variables
Sources of Data
Dealing with Messy Data
Dealing with Large Data
Reproducibility and Best Practices
Visualizing and Manipulating Data Using R
Data Visualization with Lattice
Cluster Analysis
Agglomerative Clustering Using hclust()
Dimensionality Reduction with Principal Component Analysis
Exploring Association Rules with Apriori
Probability Distributions, Covariance, and Correlation
Linear Regression
Classification with k-Nearest Neighbors and Naive Bayes
Classification Trees
Multilevel Analyses
Text Analytics with R
Cross-validation and Bootstrapping Using Caret and Exporting Predictive Models Using PMML
Exercises and Solutions
Further Reading and References
Gearing Up for Predictive Modeling
Linear Regression
Logistic Regression
Neural Networks
Support Vector Machines
Tree-based Methods
Ensemble Methods
Probabilistic Graphical Models
Time Series Analysis
Topic Modeling
Recommendation Systems
Bibliography