r - Show raw values and weighted mean for each factor level in ggplot2
I am trying to show a variable (allele-specific expression) for different factor levels (samples), together with its weighted mean (weight = coverage).
I have made some sample data:
    set.seed(2)
    x <- sample(c("a","b","c"), 100, replace=TRUE)
    y <- rnorm(100)
    w <- ceiling(rnorm(100,200,200))
    df <- data.frame(x, y, w)

    library(ggplot2)
    # note: fun.y was renamed to fun in ggplot2 3.3.0
    ggplot(df, aes(x=factor(x), y=y, weight=w)) +
      geom_point(aes(size=w)) +
      stat_summary(fun.y=mean, colour="red", geom="point", size=5)
(I also tried to post the plot, but I do not have enough points yet.)
This works fine, but it shows the unweighted mean...
    library(plyr)
    means <- ddply(df, "x", function(x) data.frame(wm=weighted.mean(x$y, x$w), m=mean(x$y)))
    means
      x          wm           m
    1 a  0.00878432  0.11027454
    2 b -0.07283770 -0.13605530
    3 c -0.14233389  0.08116117
So I am trying to show the "wm" values as the red dots instead, using ggplot2. I think I must not be using "weight=.." correctly, and I am giving up for now...
I hope someone can help.
I'd create a summary data.frame with the mean and the weighted mean first, as follows:
    require(plyr)
    dd <- ddply(df, .(x), summarise, m=mean(y), wm=weighted.mean(y, w))
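If plyr is not available, roughly the same summary can be built in base R. This is only a sketch under that assumption: the names `groups`, `dd_base` and `manual` are made up for illustration and are not part of the answer above. It also checks that `weighted.mean(y, w)` agrees with `sum(w * y) / sum(w)` within each level.

```r
# Sample data as in the question
set.seed(2)
x <- sample(c("a", "b", "c"), 100, replace = TRUE)
y <- rnorm(100)
w <- ceiling(rnorm(100, 200, 200))
df <- data.frame(x, y, w)

# One data frame per factor level
groups <- split(df, df$x)

# Mean and weighted mean per level (dd_base is a hypothetical name)
dd_base <- data.frame(
  x  = names(groups),
  m  = sapply(groups, function(g) mean(g$y)),
  wm = sapply(groups, function(g) weighted.mean(g$y, g$w)),
  row.names = NULL
)

# Sanity check: the weighted mean written out by hand
manual <- sapply(groups, function(g) sum(g$w * g$y) / sum(g$w))
all.equal(dd_base$wm, unname(manual))
```

The resulting data frame has one row per level, so it can be used in place of `dd` when plotting.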
Then, I'd plot this by loading both data sets, showing the mean and the weighted mean.
    require(reshape2) # for melt
    require(ggplot2)
    p <- ggplot() +
      geom_point(data = df, aes(x=factor(x), y=y, size=w)) +
      geom_point(data = melt(dd, id.vars="x"), aes(x=x, y=value, colour=variable), size=5)
    p

    # if you want to remove the legend "variable"
    p + scale_colour_discrete(breaks=NULL)
You might also want to consider using scale_size_area() to provide a better/unbiased allocation of size values.
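For what it's worth, here is a minimal sketch of how that could look if only the weighted mean should appear as the red dots, as the question asked. It assumes ggplot2 is installed; the weighted means are computed in base R here only to keep the example self-contained, and `wm_by_level`, `dd_wm` and `p_area` are illustrative names.

```r
library(ggplot2)

# Sample data as in the question
set.seed(2)
x <- sample(c("a", "b", "c"), 100, replace = TRUE)
y <- rnorm(100)
w <- ceiling(rnorm(100, 200, 200))
df <- data.frame(x, y, w)

# Weighted mean of y per level of x (hypothetical helper names)
wm_by_level <- sapply(split(df, df$x), function(g) weighted.mean(g$y, g$w))
dd_wm <- data.frame(x = names(wm_by_level), wm = as.numeric(wm_by_level))

p_area <- ggplot() +
  # raw values, sized by coverage
  geom_point(data = df, aes(x = factor(x), y = y, size = w)) +
  # weighted means as fixed-size red dots
  geom_point(data = dd_wm, aes(x = x, y = wm), colour = "red", size = 5) +
  # scale point area (not radius) with w, so that 0 maps to size 0
  scale_size_area()
p_area
```

Because the red layer gets its own data frame, the `weight` aesthetic is not needed at all in this version.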