
Bubble Chart using R - x axis variables are not in numerical order + Scaling of axis

I have the following R code, which uses some dummy data. I am trying to create a bubble chart where the size of each bubble depends on the amount, and its position is given by the profitability (as a % of the amount) on the x-axis and the volatility (as a % of the amount) on the y-axis. The code is as follows:

library(rio)      # import() is assumed to come from the rio package
library(ggplot2)

rio_csv <- import("~/Desktop/R/Dummy Data.csv")

# Select columns to go into df
df <- data.frame("Volpc" = rio_csv[, 6], "Profitpc" = rio_csv[, 5], "Amount" = rio_csv[, 4])

# Plot bubble chart
plot <- ggplot(df, aes(x = Profitpc, y = Volpc, size = Amount)) +
  geom_point(alpha = 0.2) + scale_size(range = c(5, 15)) + xlab("Profitability %") +
  ylab("Volatility %")

plot

The profitability measure on the x-axis and the volatility on the y-axis are both percentages. Both columns have the data type 'character'.

My first problem is that when I run the code a bubble chart appears, but the x-axis is not in numerical order, whereas the y-axis is.

I tried df$Profitpc <- as.numeric(df$Profitpc), but this turns every value in the column into NA, with the warning 'NAs introduced by coercion'.

Is there a way of ordering the x-axis so that it is in increasing numerical order?

My second problem is that neither axis is suitably scaled. Ideally, both axes would run from 0 to the maximum % value. Is there a way to do this as well? I am sorry if this is obvious. I have attached a picture of the chart to illustrate the issues.

You've given us your code but not your data, so this isn't a simple self-contained example or a reprex. [See this post for more advice on how to give us what we need to help you.]

However, from the symptoms you describe, I'm guessing that df$Profitpc contains values such as 27.0%. That's why as.numeric() fails: it doesn't know how to handle the %. So your solution is to reformat your input data so that df$Profitpc truly is numeric. Then the graph will behave as you want. As you haven't given us your input data, you're on your own when it comes to doing that...
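As a minimal sketch of that reformatting in base R, the percent sign can be stripped before converting. The values below are made up for illustration, since the real CSV was not posted, and the names profit_raw and profit_num are hypothetical.

```r
# Hypothetical values standing in for the unseen CSV column
profit_raw <- c("27.0%", "5.5%", "81.4%")

# as.numeric() cannot parse the trailing "%", so every value becomes NA
# (the coercion warning is suppressed here only to keep the output quiet)
suppressWarnings(as.numeric(profit_raw))

# Remove the literal "%" first, then convert to numeric
profit_num <- as.numeric(sub("%", "", profit_raw, fixed = TRUE))
profit_num
```

The same pattern could be applied to a data frame column, for example df$Profitpc <- as.numeric(sub("%", "", df$Profitpc, fixed = TRUE)), assuming every entry has the form of a number followed by a single %.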

Personally, I'd make the same change to df$Volpc as well. As you've discovered, it's only luck that has presented the data in the order you want. Once you've got numeric data (and, as a result, the order of display that you want), you can use features of ggplot to format its appearance the way you want.
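To illustrate that last point, here is a sketch that starts both axes at 0 and labels the ticks as percentages. It assumes the columns are already numeric; the data frame df_num and the helper pct() are hypothetical names introduced for this example, and the values are invented.

```r
library(ggplot2)

# Hypothetical numeric data, i.e. after the "%" has been stripped
df_num <- data.frame(
  Profitpc = c(10, 15.5, 81.4, 50),
  Volpc    = c(30, 20, 80.3, 30.3),
  Amount   = c(10, 15, 6, 12)
)

# Helper that appends "%" to the axis tick labels
pct <- function(x) paste0(x, "%")

p <- ggplot(df_num, aes(x = Profitpc, y = Volpc, size = Amount)) +
  geom_point(alpha = 0.2) +
  scale_size(range = c(5, 15)) +
  # limits = c(0, NA): start at 0 and let ggplot2 choose the upper limit from the data
  scale_x_continuous(limits = c(0, NA), labels = pct) +
  scale_y_continuous(limits = c(0, NA), labels = pct) +
  xlab("Profitability %") +
  ylab("Volatility %")

p
```

Passing NA for the upper limit keeps the axis tied to the data's maximum, so the call does not need editing when the values change. A fixed range such as limits = c(0, 100) would also work if the percentages are known not to exceed 100.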

The lesson here is that it is important to separate the derivation of your data from its presentation.

I second @Limey. Still, what you could try is to check whether Profitpc is a factor and, if so, convert it to character like this:

ggplot(df, aes(x = as.character(Profitpc), y = Volpc, size = Amount)) +
  geom_point(alpha = 0.2) + scale_size(range = c(5, 15)) + xlab("Profitability %") +
  ylab("Volatility %")

That still does not guarantee that the order will be right, so I would also convert the variables to numeric. You could use parse_number() from the readr package like this:

library(readr)  # provides parse_number()

ggplot(df, aes(x = parse_number(Profitpc), y = parse_number(Volpc), size = Amount)) +
  geom_point(alpha = 0.2) + scale_size(range = c(5, 15)) + xlab("Profitability %") +
  ylab("Volatility %")

Data

df <- tibble::tribble(
  ~Profitpc, ~Volpc,   ~Amount,
  "10%",     "30%",    10L,
  "15.50%",  "20%",    15L,
  "81.40%",  "80.30%", 6L,
  "50%",     "30.3%",  12L
)
