Descriptive Statistics and Statistical Distributions Using R

GIS 470 Statistics for Geographers Assignment #2: Descriptive Statistics and Statistical Distributions Using R

Introduction

This assignment is an opportunity to continue developing your skills in reporting descriptive statistics and in using the software package R. As in assignment #1, you will be analyzing three variables of your choosing from the class survey data for the first few questions. Please do your best to apply the principles and techniques we have learned in class. You are welcome to use your notes or any educational resources available on the internet, but please ensure that your output and interpretation are your original work. This assignment will continue beyond our exploration of descriptive statistics into exploration of statistical distributions. The flexibility and open-source nature of R mean that there are many possible ways to generate the correct answers to each question. Be sure to document your work as best you can to maximize your opportunities to earn points. Comments in your R scripts will help you, and those evaluating your work, understand what you are trying to accomplish!

How to complete this assignment

R and RStudio software can be freely downloaded from the internet. Instructions and download files can be found here: https://www.rstudio.com/ and in many other locations. The software packages are also available on many ASU computers. If you have any difficulty accessing the software, please contact the instructor or TA. Recall that most installations of RStudio require R to be installed on the machine as well, although you will only need to interact with RStudio to complete this assignment.

The best way to ensure you receive full credit on this assignment is to work from this document as a template. To provide answers to the questions below, you can copy and paste output from the RStudio command window, or from a .csv file that you output using script (write.csv). Figures can be copied and pasted, or saved, from the plots window in RStudio.

There are six questions that total 100 points, plus opportunities for bonus points.

All students will need to upload two files to Blackboard:

- Their completed assignment worksheet (this document, with your content added)
- A saved version of the R Script that contains all commands you used for your analysis (.R file)

This assignment is due by 5pm on Monday, March 26. Good luck!

[3 points] Identify three variables you would like to analyze from the class survey data. (Recall that a clean version of the class survey data, with metadata, is available on Blackboard.) You should select one categorical variable, one ordinal variable, and one continuous variable. Try to pick variables that you think will have interesting differences between the groups of the categorical variable you select. Please do not use more than one of the three in-class variables (Taco Bell/Chipotle, Spiciness Preference, and Number of Countries) in your selection. You may use the same three variables you examined in assignment #1, so long as they meet the requirements above.

| Variable Type | Variable Name |
|---|---|
| Categorical (Note: it will be easier to complete this assignment if you choose a categorical variable with 2-5 categories.) | |
| Ordinal (Note: it will be easier to complete this assignment if you choose an ordinal variable with 2-5 categories.) | |
| Continuous | |

[10 points] Download and open the “DSandProb_Lab.R” script that is available on the Blackboard site. You will be making modifications to this script to examine your variables of interest. Please pay careful attention to all of the elements of the script for each line of the code you are working on, such that the names of variables, numbers of categories, labels for tables and figures, etc., are appropriate for your data.

Open the existing R script, adjust the working directories, and load the class survey data set. Also run the lines of code that activate the ggplot2, e1071, and dplyr libraries. (You may need to install the associated packages if they are not already on your machine.) Did you load all three libraries successfully?
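One way to check whether the three packages loaded is sketched below. This is not the code from the course script, whose contents are not shown here; it only uses base R's `require()`, which returns `TRUE` or `FALSE` instead of stopping with an error.

```r
# Packages named in the assignment
pkgs <- c("ggplot2", "e1071", "dplyr")

# If a package is missing, it can be installed first (needs internet access):
# install.packages("e1071")

# Try to load each package; require() returns FALSE when a package is unavailable
loaded <- suppressWarnings(
  vapply(pkgs, require, logical(1), character.only = TRUE, quietly = TRUE)
)
print(loaded)  # one TRUE/FALSE entry per package
```

Any `FALSE` entry points to a package that still needs to be installed.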

When you initially read in the survey data from the .csv file, how many rows and columns did the data set have?
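A minimal sketch of reading a .csv file and checking its size follows. The file written here is a tiny hypothetical stand-in, not the class survey data, and the column names are invented for illustration.

```r
# Hypothetical stand-in for the survey file: a tiny CSV written to a temp file
tmp <- tempfile(fileext = ".csv")
writeLines(c("Year,Spice,Countries",
             "Junior,Mild,3",
             "Senior,Hot,7",
             ",,",
             ",,"), tmp)

# Treat empty cells as missing values
survey <- read.csv(tmp, na.strings = c("", "NA"))

dim(survey)    # number of rows, then number of columns
nrow(survey)   # rows only
ncol(survey)   # columns only
```

Trailing rows made only of empty cells, like the last two here, are a common reason the initial row count is larger than the number of respondents.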

Save your script with a new file name. Type the file name you used for your R script in the box below.

Create a subset of the survey data that only contains the columns relevant for your analysis. Remove extra rows from the initial read in of the data that only contain “NA” values.

After you create your subset of the survey data and remove the empty rows, how many rows and columns does the data set have?
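The subsetting step can be sketched in base R as shown below. The data frame and column names are hypothetical, and the course script may take a different route (for example with dplyr).

```r
# Hypothetical survey with one extra column and two completely empty rows
survey <- data.frame(
  Year      = c("Junior", "Senior", NA, "Senior", NA),
  Spice     = c("Mild", "Hot", NA, "Medium", NA),
  Countries = c(3, 7, NA, NA, NA),
  Extra     = c(1, 2, NA, 4, NA)
)

# Keep only the columns needed for the analysis
keep <- c("Year", "Spice", "Countries")
sub  <- survey[, keep]

# Drop rows in which every remaining column is NA
all_na <- rowSums(is.na(sub)) == ncol(sub)
sub    <- sub[!all_na, ]

dim(sub)  # rows and columns after cleaning
```

Note that the fourth row is kept: it has one missing value, but it is not entirely empty.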

[24 points] For each of your three variables, generate a table of summary statistics (either a frequency table or a table of descriptive statistics) and an appropriate visual representation of the data (either a bar graph or histogram). On each graph, adjust at least one of the aesthetic elements (such as the bar color or font choice) to something other than the default setting. Ensure that the plots are appropriately labeled. If generating a histogram, ensure that the binning scheme is logical.

Frequency tables can be generated using either the “dplyr” or “table” options from the code. If you are generating a table of descriptive statistics, please show the mean, median, standard deviation, skewness, and interquartile range.

Categorical variable:

Copy and paste your summary statistics table here (you can copy directly from the command window to receive full credit, or earn a bonus point by exporting your table as a .csv file and then copying and pasting in a table from Excel or other software):

Insert your graph below. Ensure that the labels on the graph are appropriate for your variable!
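As an illustration of the “table” option, the sketch below builds a frequency table, exports it with `write.csv`, and draws a bar graph with a non-default color. The responses, labels, and output file name are hypothetical.

```r
# Hypothetical responses for a categorical variable
year <- c("Junior", "Senior", "Senior", "Sophomore", "Junior", "Senior")

# Frequency table as a data frame with counts and percentages
freq    <- table(year)
freq_df <- as.data.frame(freq, responseName = "Count")
freq_df$Percent <- round(100 * freq_df$Count / sum(freq_df$Count), 1)
print(freq_df)

# Optional export to .csv (hypothetical file name in a temporary folder)
out_file <- file.path(tempdir(), "year_frequency.csv")
write.csv(freq_df, out_file, row.names = FALSE)

# Bar graph with a non-default bar color and explicit labels
barplot(freq, col = "steelblue",
        main = "Class standing of respondents",
        xlab = "Class standing", ylab = "Number of students")
```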

Ordinal variable:

Copy and paste your summary statistics table here (you can copy directly from the command window to receive full credit, or earn a bonus point by exporting your table as a .csv file and then copying and pasting in a table from Excel or other software):

Insert your graph below. Ensure that the labels on the graph are appropriate for your variable!
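For an ordinal variable, the order of the categories matters in both the table and the graph. A minimal sketch, assuming hypothetical spiciness responses, is to declare the levels explicitly so that R does not sort them alphabetically.

```r
# Hypothetical ordinal responses
spice <- c("Medium", "Hot", "Mild", "Medium", "Hot", "Medium")

# Declare the category order explicitly
spice <- factor(spice, levels = c("Mild", "Medium", "Hot"), ordered = TRUE)

# The frequency table and bar graph now follow the declared order
spice_freq <- table(spice)
print(spice_freq)

barplot(spice_freq, col = "darkorange",
        main = "Spiciness preference",
        xlab = "Preferred spiciness", ylab = "Number of students")
```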

Continuous variable:

Copy and paste your summary statistics table here (you can copy directly from the command window to receive full credit, or earn a bonus point by exporting your table as a .csv file and then copying and pasting in a table from Excel or other software). For this variable, be sure that your descriptive statistics include (at least) the mean, median, standard deviation, and skewness.

What is the interval of values that is within one standard deviation of the mean? (e.g., “-5.4 to +3.2”)

What is the interval of values that is within two standard deviations of the mean? (e.g., “-10.7 to +9.9”)
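The descriptive statistics and the two intervals can be sketched as follows. The data are hypothetical. Skewness is computed by hand here from the second and third central moments so the example runs without extra packages; the `skewness` function in e1071 may give a slightly different value depending on the estimator type it uses.

```r
# Hypothetical continuous variable (e.g., number of countries visited)
countries <- c(1, 2, 2, 3, 4, 5, 7, 12)

m <- mean(countries)
s <- sd(countries)

# Moment-based skewness: third central moment over second central moment^(3/2)
skew <- mean((countries - m)^3) / mean((countries - m)^2)^(3 / 2)

stats <- data.frame(Mean = m, Median = median(countries), SD = s,
                    Skewness = skew, IQR = IQR(countries))
print(stats)

# Intervals within one and two standard deviations of the mean
one_sd <- c(lower = m - s,     upper = m + s)
two_sd <- c(lower = m - 2 * s, upper = m + 2 * s)
print(round(one_sd, 2))
print(round(two_sd, 2))
```

A positive skewness value, as in this made-up sample, indicates a longer tail toward large values.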

Insert your graph below. Ensure that the labels on the graph are appropriate for your variable!
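A logical binning scheme usually means bin edges that fall on round numbers and cover the full range of the data. The sketch below uses base R's `hist` with explicit breaks on hypothetical data; the bin width of 2 is an arbitrary choice for illustration.

```r
# Hypothetical continuous variable
countries <- c(1, 2, 2, 3, 4, 5, 7, 12)

# Bins of width 2 that span all observed values
h <- hist(countries, breaks = seq(0, 12, by = 2),
          col = "seagreen",
          main = "Countries visited by respondents",
          xlab = "Number of countries", ylab = "Number of students")

h$counts  # number of observations in each bin
```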

**Now would be a great time to re-save your script**

[20 points] Examine how the descriptive statistics of your continuous and ordinal variables vary across the groups of your categorical variable.

First, generate a table that shows the descriptive statistics of your continuous variable within each of the groups of your categorical variable. You can copy directly from the command window to receive full credit, or earn a bonus point by exporting your table as a .csv file and then copying and pasting in a table from Excel or other software. Ensure that your table has appropriate column headers. Insert the table below.

Next, generate a split histogram that shows the distribution of the continuous variable within each of the groups of your categorical variable. Ensure that your figure has appropriate labels. Insert the split histogram below.

Next, generate a frequency table that shows the counts of responses for each of the ordinal variable categories within each of the groups of your categorical variable. You can copy directly from the command window to receive full credit, or earn a bonus point by exporting your table as a .csv file and then copying and pasting in a table from Excel or other software. Ensure that your table has appropriate column headers. Insert the table below.

Finally, generate a split bar graph that shows the frequency of responses for the ordinal variable within each of the groups of your categorical variable. Ensure that your figure has appropriate labels. Insert the split bar graph below.
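The four group-wise outputs can be sketched in base R as shown below. The data frame and its column names are hypothetical, and the course script may instead use dplyr for the tables and ggplot2 for the split plots.

```r
# Hypothetical survey subset
survey <- data.frame(
  Year      = c("Junior", "Junior", "Junior", "Senior", "Senior", "Senior"),
  Spice     = factor(c("Mild", "Hot", "Medium", "Hot", "Hot", "Mild"),
                     levels = c("Mild", "Medium", "Hot"), ordered = TRUE),
  Countries = c(1, 3, 5, 2, 6, 10)
)

# Descriptive statistics of the continuous variable within each group
by_group <- aggregate(Countries ~ Year, data = survey,
                      FUN = function(x) c(Mean = mean(x), Median = median(x), SD = sd(x)))
by_group <- do.call(data.frame, by_group)  # flatten into ordinary columns
print(by_group)

# Split histogram: one panel per group, same bins in each panel
op <- par(mfrow = c(1, 2))
for (g in unique(survey$Year)) {
  hist(survey$Countries[survey$Year == g], breaks = seq(0, 10, by = 2),
       col = "seagreen", main = g, xlab = "Countries visited")
}
par(op)

# Frequency of each ordinal category within each group
cross <- table(survey$Year, survey$Spice, dnn = c("Year", "Spice"))
print(cross)

# Split (grouped) bar graph of the ordinal variable by group
barplot(t(cross), beside = TRUE, legend.text = TRUE,
        col = c("khaki", "orange", "firebrick"),
        main = "Spiciness preference by class standing",
        xlab = "Class standing", ylab = "Number of students")
```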

[21 points] We’re now going to revisit our work with the binomial distribution. We will continue with our example where we are estimating the probability that a student who is making random guesses on a multiple choice exam answers a certain number of questions correctly. Be sure to follow the code carefully and make adjustments as needed to ensure you are calculating the probabilities correctly for the parameters you specify below. This will be especially important when creating the blank matrix for your for loop in the last step!

Let’s set some parameters for our work.

Choose a number of choices that are available on each question (2 through 5):

Divide one by the number above. This is the probability of success on each trial (variable m in the code):

On what day of the month (1-31) does your birthday fall?

Let’s use that number as the number of questions on the exam (the number of trials, variable n).

Assume that a student needs to get at least 50% of the questions right to pass the exam. How many questions do they need to get correct?

How many combinations are possible to get the number of questions correct you specified above, given the total number of questions n? (For example, there are 252 ways to get 5 questions right on a 10-question exam.)
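The 10-question example can be checked with base R's `choose` function. The variable `k` is a name introduced here for the number of correct answers needed; it is not taken from the course script.

```r
n <- 10                 # number of questions (trials)
k <- ceiling(0.5 * n)   # smallest number of correct answers that is at least 50%

choose(n, k)            # 252 ways to get 5 of 10 questions right
```

Using `ceiling` handles odd values of n: with 7 questions, a student needs 4 correct to reach at least 50%.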

Use the binomial probability distribution formula to estimate the probability that they get EXACTLY the needed number of questions correct. What is that probability?
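The binomial probability of exactly k successes in n trials is C(n, k) · m^k · (1 − m)^(n − k). A minimal sketch with assumed parameters (4 answer choices, 10 questions, 5 needed) follows; `k` and `choices` are names introduced here for illustration, and `dbinom` serves as an independent check.

```r
choices <- 4            # assumed number of answer choices per question
m <- 1 / choices        # probability of success on each trial
n <- 10                 # assumed number of questions
k <- 5                  # number of correct answers of interest

# Binomial formula written out term by term
p_exact <- choose(n, k) * m^k * (1 - m)^(n - k)
print(p_exact)

# Cross-check against R's built-in binomial density
all.equal(p_exact, dbinom(k, size = n, prob = m))
```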

Next, use a for loop to estimate the probability that a randomly guessing student gets AT LEAST the needed number of questions correct. What is that probability?
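One possible shape for the loop is sketched below, under the same assumed parameters as a 10-question exam with 4 choices. The key detail is that the blank matrix needs one row for every outcome from k through n, which is n − k + 1 rows. The names `k`, `probs`, and `p_at_least` are introduced here and may not match the course script.

```r
m <- 1 / 4   # assumed probability of success on each trial
n <- 10      # assumed number of questions
k <- 5       # assumed number of correct answers needed to pass

# Blank matrix: one row per possible passing score (k, k + 1, ..., n)
probs <- matrix(NA, nrow = n - k + 1, ncol = 2,
                dimnames = list(NULL, c("correct", "probability")))

# Fill in the exact binomial probability for each passing score
for (i in k:n) {
  row <- i - k + 1
  probs[row, ] <- c(i, choose(n, i) * m^i * (1 - m)^(n - i))
}

# Probability of AT LEAST k correct is the sum over all passing scores
p_at_least <- sum(probs[, "probability"])
print(p_at_least)

# Cross-check with the built-in upper-tail probability P(X > k - 1)
pbinom(k - 1, size = n, prob = m, lower.tail = FALSE)
```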

[22 points] We’ll wrap up the assignment with a look at the count data you collected a few weeks ago. As you recall, many “true” count processes can be modeled by the Poisson distribution.

Read in the “GIS470_CountData_Spr2018.csv” file that is available on Blackboard. This file contains one column for each student and one row for each of the 20 observations you made. (If you did not complete the assignment, you can use another student’s data, but be sure that you understand what the data are that they collected.) Recall that the metadata are also available on Blackboard by following the link to “Feb 26 Class Activities Spreadsheet.”

Which column/variable will you be examining?

What do the values in this column represent?

First, create a simple bar graph that shows the number of occurrences you recorded in each interval over the ten minute time period. Insert the graph below. (Note: there is not a template code for generating this bar graph at this point in the script – you may have to borrow some ideas from earlier in the code, or use your well-honed skills of searching the internet for guidance).
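Since the script has no template for this graph, one possible approach is base R's `barplot` with one bar per observation interval. The 20 counts below are hypothetical and stand in for one student's column of the count data.

```r
# Hypothetical counts for 20 observation intervals
counts <- c(2, 0, 1, 3, 1, 2, 4, 0, 1, 2, 3, 1, 0, 2, 1, 5, 2, 1, 3, 2)

# One bar per interval, labeled 1 through 20
mids <- barplot(counts, names.arg = seq_along(counts),
                col = "slateblue",
                main = "Occurrences recorded in each interval",
                xlab = "Observation interval", ylab = "Number of occurrences")
```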

Next, create a histogram that has the probability of each count occurring on the vertical axis (rather than the count itself – use the example provided in the code). Insert that histogram below.

Recall that only one parameter is needed to specify the Poisson distribution. What is that parameter?

What is the value of that parameter for your variable of interest?

Generate a histogram for a pure Poisson distribution that has the parameter you specified above. The x-axis for your histogram should span the range of values that you observed in your data (in the version of the script you downloaded, the range is set from 0 to 12, but this might not be appropriate for your data). Insert that histogram below.
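A sketch of the comparison follows, using hypothetical counts. The Poisson parameter is estimated by the sample mean, observed proportions are computed for every value from 0 up to the largest observed count, and `dpois` gives the matching theoretical probabilities. Starting the axis at 0 is a choice made here for illustration.

```r
# Hypothetical counts for 20 observation intervals
counts <- c(2, 0, 1, 3, 1, 2, 4, 0, 1, 2, 3, 1, 0, 2, 1, 5, 2, 1, 3, 2)

# The single Poisson parameter, estimated by the sample mean
lambda <- mean(counts)

# Values to show on the x-axis: 0 up to the largest observed count
x <- 0:max(counts)

# Observed proportion of intervals with each count (zeros kept for unseen values)
observed <- as.numeric(table(factor(counts, levels = x))) / length(counts)

# Probabilities expected under a pure Poisson distribution with the same mean
expected <- dpois(x, lambda)

# Side-by-side bars for a visual comparison
barplot(rbind(Observed = observed, Poisson = expected), beside = TRUE,
        names.arg = x, legend.text = TRUE,
        col = c("slateblue", "gray70"),
        main = "Observed counts versus Poisson expectation",
        xlab = "Count per interval", ylab = "Probability")
```

The expected probabilities do not sum exactly to 1 because the Poisson distribution also assigns a small probability to counts above the plotted range.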

How well do the data you collected match what would be expected for data that meet the assumptions for a Poisson distribution? If the distributions are not particularly well matched, what are some factors that might have contributed to that discrepancy?

[necessary to receive credit] Be sure to complete the following actions before closing R:

- Save your R script
- Save this Word document
- Upload both of those documents to Assignment #2 on Blackboard.

Please use intuitive file names.

[1 bonus point if you make a prediction] This assignment will be scored out of 100 total points. How many points do you think you earned?

RUBRIC

Content (worth a maximum of 50% of the total points)

- No response (zero points): Student failed to submit the final paper.
- Poor / unsatisfactory (20 points out of 50): The essay illustrates poor understanding of the relevant material by failing to address or incorrectly addressing the relevant content; failing to identify or inaccurately explaining/defining key concepts/ideas; ignoring or incorrectly explaining key points/claims and the reasoning behind them; and/or incorrectly or inappropriately using terminology; and elements of the response are lacking.
- Satisfactory (30 points out of 50): The essay illustrates a rudimentary understanding of the relevant material by mentioning but not fully explaining the relevant content; identifying some of the key concepts/ideas though failing to fully or accurately explain many of them; using terminology, though sometimes inaccurately or inappropriately; and/or incorporating some key claims/points but failing to explain the reasoning behind them or doing so inaccurately. Elements of the required response may also be lacking.
- Good (40 points out of 50): The essay illustrates solid understanding of the relevant material by correctly addressing most of the relevant content; identifying and explaining most of the key concepts/ideas; using correct terminology; explaining the reasoning behind most of the key points/claims; and/or where necessary or useful, substantiating some points with accurate examples. The answer is complete.
- Excellent (50 points): The essay illustrates exemplary understanding of the relevant material by thoroughly and correctly addressing the relevant content; identifying and explaining all of the key concepts/ideas; using correct terminology; explaining the reasoning behind key points/claims; and substantiating, as necessary/useful, points with several accurate and illuminating examples. No aspects of the required answer are missing.

Use of Sources (worth a maximum of 20% of the total points)

- No response (zero points): Student failed to include citations and/or references, or the student failed to submit a final paper.
- Poor / unsatisfactory (5 out of 20 points): Sources are seldom cited to support statements and/or the format of citations is not recognizable as APA 6th Edition format. There are major errors in the formation of the references and citations. And/or there is a major reliance on highly questionable sources. The student fails to provide an adequate synthesis of research collected for the paper.
- Satisfactory (10 out of 20 points): References to scholarly sources are occasionally given; many statements seem unsubstantiated. Frequent errors in APA 6th Edition format leave the reader confused about the source of the information. There are significant errors in the formation of the references and citations. And/or there is a significant use of highly questionable sources.
- Good (15 out of 20 points): Credible scholarly sources are used effectively to support claims and are, for the most part, clear and fairly represented. APA 6th Edition is used with only a few minor errors. There are minor errors in references and/or citations. And/or there is some use of questionable sources.
- Excellent (20 points): Credible scholarly sources are used to give compelling evidence to support claims and are clearly and fairly represented. APA 6th Edition format is used accurately and consistently. The student uses above the maximum required references in the development of the assignment.

Grammar (worth a maximum of 20% of the total points)

- No response (zero points): Student failed to submit the final paper.
- Poor / unsatisfactory (5 points out of 20): The paper does not communicate ideas/points clearly due to inappropriate use of terminology and vague language; thoughts and sentences are disjointed or incomprehensible; organization is lacking; and/or there are numerous grammatical, spelling, or punctuation errors.
- Satisfactory (10 points out of 20): The paper is often unclear and difficult to follow due to some inappropriate terminology and/or vague language; ideas may be fragmented, wandering, and/or repetitive; organization is poor; and/or there are some grammatical, spelling, or punctuation errors.
- Good (15 points out of 20): The paper is mostly clear as a result of appropriate use of terminology and minimal vagueness; no tangents and no repetition; fairly good organization; almost perfect grammar, spelling, punctuation, and word usage.
- Excellent (20 points): The paper is clear, concise, and a pleasure to read as a result of appropriate and precise use of terminology, total coherence of thoughts and presentation, and logical organization; and the essay is error free.

Structure of the Paper (worth 10% of the total points)

- No response (zero points): Student failed to submit the final paper.
- Poor / unsatisfactory (3 points out of 10): Student needs to develop better formatting skills. The paper omits significant structural elements required for an APA 6th edition paper. Formatting of the paper has major flaws. The paper does not conform to APA 6th edition requirements whatsoever.
- Satisfactory (5 points out of 10): Appearance of the final paper demonstrates the student’s limited ability to format the paper. There are significant errors in formatting and/or the total omission of major components of an APA 6th edition paper. They can include the omission of the cover page, abstract, and page numbers. Additionally, the page has major formatting issues with spacing or paragraph formation. Font size might not conform to size requirements. The student also writes a paper that is significantly too long or too short.
- Good (7 points out of 10): Research paper presents an above-average use of formatting skills. The paper has slight errors. These can include small errors or omissions with the cover page, abstract, page number, and headers. There could also be slight formatting issues with the document spacing or the font. Additionally, the paper might slightly exceed or undershoot the specific number of required written pages for the assignment.
- Excellent (10 points): Student provides a high-caliber, formatted paper. This includes an APA 6th edition cover page, abstract, page number, and headers, and is double spaced in 12-point Times Roman font. Additionally, the paper conforms to the specific number of required written pages and neither goes over nor under the specified length of the paper.