Center for Social Research, University of Notre Dame, 2013. – 42 p. – ISBN: N/A
The purpose of this document is to provide a conceptual introduction to statistical or machine learning (ML) techniques for those who might not normally be exposed to such approaches during their typical required statistical training. Machine learning can be described as a form of statistics, often even utilizing well-known and familiar techniques, that has a somewhat different focus than traditional analytical practice in the social sciences and other disciplines. The key notion is that flexible, automatic approaches are used to detect patterns within the data, with a primary focus on making predictions on future data.
If one surveys the techniques available in ML without context, the sheer number of approaches, along with their various tweaks and variations, will surely be overwhelming. However, the specifics of the techniques are not as important as the more general concepts that apply in almost every ML setting, and indeed in many traditional ones as well. While there will be examples using the R statistical environment and descriptions of a few specific approaches, the focus here is more on ideas than application and is kept at the conceptual level as much as possible. However, some applied examples of the more common techniques will be provided in detail.