In data analysis more than anything, a picture really is worth a thousand words. When you start analyzing data in R, your first step shouldn't be to run a complex statistical test: first, you should visualize your data in a graph. This lets you understand the basic nature of the data, so that you know what tests you can perform and where you should focus your analysis.

I'm David Robinson, and in this lesson we'll introduce you to ggplot2, a powerful R package that produces data visualizations easily and intuitively. We will assume you are moderately familiar with basic concepts in R, including variables and functions, and with RStudio, the integrated development environment for programming in R.

ggplot2 is a third-party package: that means it's code that doesn't come built into the language, so you need to install it first. You can do that with one line of R code in your interactive terminal:

install.packages("ggplot2")

Then hit return. Or you can go to the Tools->Install Packages menu, type "ggplot2" and hit Install. Each time you reopen R, you need to load the package using the library function before you use it.

ggplot2 comes with some data available to use as a demonstration: in particular, the "diamonds" dataset, containing information about several attributes of about 54,000 diamonds. We can access it using the data function:

data("diamonds")

See that we've added "diamonds" to our global environment. Once we've loaded the diamonds dataset, we can view it using View:

View(diamonds)

Here we have a view of it, kind of like a spreadsheet. Here we have the carat, which is the weight of the diamond, and the cut, color and clarity, each of which measures something about the quality of the diamond at various levels. And then we have other attributes, including the price of the diamond. You can find out what each of these means using the help function:

help(diamonds)

Here we get a description of the diamonds dataset and the details of each of the columns.

Let's say that we as scientists are interested in understanding the relationships between those attributes. For example, how does weight, in carats, affect the price? Or how does the quality of the color, or of the diamond's clarity, affect the price? These kinds of questions, where we're looking for interesting relationships among attributes using the observations we have, are common, almost universal, across data analysis.
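The setup steps from this lesson can be collected into one short script. This is a minimal sketch, assuming ggplot2 is already installed (the install line is left commented out). The scatter plot at the end is an illustrative first look at the weight-versus-price question, not necessarily the plot the lesson goes on to build; the variable name p and the alpha value are arbitrary choices made for this example.

```r
# One-time step: uncomment if ggplot2 is not installed yet.
# install.packages("ggplot2")

# Load the package; this is needed in every new R session.
library(ggplot2)

# Add the bundled diamonds dataset to the global environment.
data("diamonds")

# Inspect the columns: carat, cut, color, clarity, price, and others.
str(diamonds)

# Illustrative first look at how weight relates to price:
# one point per diamond, carat on the x axis and price on the y axis.
# alpha = 0.2 is an arbitrary choice to make overlapping points easier to see.
p <- ggplot(diamonds, aes(x = carat, y = price)) +
  geom_point(alpha = 0.2)

# Draw the plot (printing is implicit at the interactive console).
print(p)
```

Run interactively in RStudio, the plot appears in the Plots pane. The same pattern works for the other questions raised here: for example, swapping carat for clarity or color in aes() is one way to start looking at how those attributes relate to price.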