Graphing data that is read using readHTMLTable

I want to read the following table from a webpage and then create a bar graph from it.

Language          Jobs
PHP               12,664
Java              12,558
Objective C       8,925
SQL               5,165
Android (Java)    4,981
Ruby              3,859
JavaScript        3,742
C#                3,549
C++               1,908
ActionScript      1,821
Python            1,649
C                 1,087
ASP.NET           818

My questions:

1. My bars get messed up, and each bar does not correspond to the correct language. The following is my code:

tables2 <-(readHTMLTable("http://www.sitepoint.com/best-programming-language-of-2013/",which=1))

2. Since I am a beginner at R, I would like to know what format readHTMLTable saves the data in. Is it a matrix, a data frame, or some other format?

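The main problem here is that Jobs is being read as a factor. Because of the commas in that field, you can't do a direct numeric conversion: calling as.numeric() on a factor returns its internal level codes rather than the numbers you see printed, which would explain bars that don't match their languages. You can find out what 'format' an object is in R with str(). Here str(tables2) gives: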

'data.frame':   13 obs. of  2 variables:
 $ Language: Factor w/ 13 levels "ActionScript",..: 10 7 9 13 2 12 8 5 6 1 ...
 $ Jobs    : Factor w/ 13 levels "1,087","1,649",..: 6 5 12 11 10 9 8 7 4 3 ...

So you can see Jobs is a factor, and that tables2 is a data.frame. To convert it to numeric you need to remove the commas. You can do that with gsub().

tables2$Jobs <- as.numeric(gsub(",","",tables2$Jobs))
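
To see why the commas matter, here is a minimal sketch that does not touch the network. The `jobs` vector is a hypothetical stand-in for the scraped column, built by hand from the first three values in the table; it is not the actual output of readHTMLTable.

```r
# Hypothetical stand-in for the scraped Jobs column (first three table values)
jobs <- factor(c("12,664", "12,558", "8,925"))

# Direct conversion returns the factor's internal level codes, not the job counts
codes <- as.numeric(jobs)

# Stripping the commas first recovers the real numbers;
# gsub() coerces the factor to character before matching
counts <- as.numeric(gsub(",", "", jobs))

print(codes)   # small integers that index the factor levels
print(counts)  # the actual job counts
```

Plotting something like `codes` instead of `counts` would give bar heights that have nothing to do with the job numbers.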

Now str(tables2) gives:

'data.frame':   13 obs. of  2 variables:
 $ Language: Factor w/ 13 levels "ActionScript",..: 10 7 9 13 2 12 8 5 6 1 ...
 $ Jobs    : num  12664 12558 8925 5165 4981 ...

When you redo your plot, each bar should now correspond to the correct language.
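
The plotting code is not shown in the question, so the following is only one possible sketch using base R's barplot(). It assumes a data frame shaped like the converted tables2, and here the values are typed in by hand from the table above rather than scraped.

```r
# Data typed in from the table in the question (assumed to match the scraped result)
tables2 <- data.frame(
  Language = c("PHP", "Java", "Objective C", "SQL", "Android (Java)", "Ruby",
               "JavaScript", "C#", "C++", "ActionScript", "Python", "C", "ASP.NET"),
  Jobs = c(12664, 12558, 8925, 5165, 4981, 3859, 3742,
           3549, 1908, 1821, 1649, 1087, 818)
)

# Widen the bottom margin so the rotated language names fit
par(mar = c(8, 4, 2, 1))

# names.arg labels each bar with its language; as.character() covers the case
# where Language is a factor; las = 2 draws the labels perpendicular to the axis
mids <- barplot(tables2$Jobs,
                names.arg = as.character(tables2$Language),
                las = 2, cex.names = 0.8, ylab = "Jobs")
```

Passing the labels through names.arg from the same data frame row order keeps each bar tied to its own language.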



