How to aggregate categorical data in R?


I have a data frame with two columns of categorical variables (Better, Similar, Worse). I would like to build a table that counts how many times each category appears in each of the two columns.
The data frame I am using is as follows:

      Category.x Category.y
    1     Better     Better
    2     Better     Better
    3    Similar    Similar
    4      Worse    Similar

I would like to come up with a table like this:

            Category.x Category.y
    Better           2          2
    Similar          1          2
    Worse            1          0

How would you go about it?

asked 5 hours ago by Daniel

Looks like you need table(df1)

– akrun
5 hours ago

Is it possible to reformat the table, so that I get it as a 3x2 table instead of a 3x3?

– Daniel
5 hours ago

I would convert the columns to factors with common levels, lvls <- unique(unlist(df1)); df1[] <- lapply(df1, factor, levels = lvls), and then call table(df1).

– akrun
4 hours ago
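
A minimal sketch of the common-levels idea from the comment above, assuming df1 holds the question's data as character columns. Note that table(df1) cross-tabulates the two columns against each other, which gives a 3x3 table once both columns share the same levels. Counting each column separately with sapply(df1, table) is an assumption about how to reach the 3x2 layout the question asks for, and sorting the levels is an illustrative choice; neither is stated in the comments.

```r
# Example data from the question (character columns; the name df1 follows the comments).
df1 <- data.frame(
  Category.x = c("Better", "Better", "Similar", "Worse"),
  Category.y = c("Better", "Better", "Similar", "Similar"),
  stringsAsFactors = FALSE
)

# Give both columns the same factor levels, so that a category with zero
# occurrences in one column (here "Worse" in Category.y) is still counted.
lvls <- sort(unique(unlist(df1)))
df1[] <- lapply(df1, factor, levels = lvls)

# table(df1) cross-tabulates Category.x against Category.y (3 x 3).
cross <- table(df1)

# Tabulating each column on its own gives one column of counts per variable (3 x 2).
counts <- sapply(df1, table)
print(counts)
```

Because every column has the same levels, each per-column table has the same length, so sapply() can bind them into a matrix with the categories as row names.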

Tags: r, aggregate


3 Answers

As mentioned in the comments, table() is the standard tool for this. A few ways to call it:

    table(stack(DT))

             ind
    values    Category.x Category.y
      Better           2          2
      Similar          1          2
      Worse            1          0

    table(value = unlist(DT), cat = names(DT)[col(DT)])

             cat
    value     Category.x Category.y
      Better           2          2
      Similar          1          2
      Worse            1          0

    with(reshape(DT, direction = "long", varying = 1:2),
      table(value = Category, cat = time)
    )

             cat
    value     x y
      Better  2 2
      Similar 1 2
      Worse   1 0
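
The answer does not show how DT was built. A minimal sketch, assuming DT is simply the question's data frame stored with character columns (the construction below is a hypothetical reconstruction, not the answerer's code). The character type matters for the first variant because stack() only stacks vector columns and ignores factor columns.

```r
# Hypothetical reconstruction of the question's data under the name DT.
DT <- data.frame(
  Category.x = c("Better", "Better", "Similar", "Worse"),
  Category.y = c("Better", "Better", "Similar", "Similar"),
  stringsAsFactors = FALSE  # keep characters so stack() picks the columns up
)

# stack() returns a long data frame with two columns:
# values (the category) and ind (the name of the source column).
long <- stack(DT)

# table() on that two-column data frame counts every values/ind combination,
# including combinations that never occur (reported as 0).
res <- table(long)
print(res)
```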

answered 4 hours ago by Frank


    sapply(df1, function(x) sapply(unique(unlist(df1)), function(y) sum(y == x)))
    #         Category.x Category.y
    # Better           2          2
    # Similar          1          2
    # Worse            1          0

answered 4 hours ago by d.b


One dplyr and tidyr possibility could be:

    library(dplyr)
    library(tidyr)

    df %>%
      gather(var, val) %>%
      count(var, val) %>%
      spread(var, n, fill = 0)

      val     Category.x Category.y
      <chr>        <dbl>      <dbl>
    1 Better           2          2
    2 Similar          1          2
    3 Worse            1          0

First, gather() reshapes the data from wide to long format, with column "var" holding the variable names and column "val" the corresponding values. Second, count() tallies each combination of "var" and "val". Finally, spread() reshapes the counts back to wide format, filling combinations that never occur with 0.
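
For readers who want to see the intermediate results, the same three steps can be sketched in base R. This is an illustration under assumptions, not the code from the answer: it assumes df holds the question's data as character columns, and it swaps in aggregate() and reshape() for count() and spread(), so the count columns come out named n.Category.x and n.Category.y.

```r
# Example data from the question, assumed to be stored as character columns.
df <- data.frame(
  Category.x = c("Better", "Better", "Similar", "Worse"),
  Category.y = c("Better", "Better", "Similar", "Similar"),
  stringsAsFactors = FALSE
)

# Step 1: wide to long. One row per cell, with the source column name in
# "var" and the cell's category in "val".
long <- data.frame(
  var = rep(names(df), each = nrow(df)),
  val = unlist(df, use.names = FALSE),
  stringsAsFactors = FALSE
)

# Step 2: count the rows for each (var, val) combination that occurs.
counts <- aggregate(n ~ var + val, data = transform(long, n = 1L), FUN = sum)

# Step 3: long to wide, one count column per variable. Combinations that
# never occur come back as NA and are then filled with 0.
wide <- reshape(counts, idvar = "val", timevar = "var", direction = "wide")
wide[is.na(wide)] <- 0
print(wide)
```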

Or with dplyr and reshape2 you can do:

    library(dplyr)
    library(reshape2)

    df %>%
      mutate(rowid = row_number()) %>%
      melt(., id.vars = "rowid") %>%
      count(variable, value) %>%
      dcast(value ~ variable, value.var = "n", fill = 0)

        value Category.x Category.y
    1  Better          2          2
    2 Similar          1          2
    3   Worse          1          0

answered 4 hours ago by tmfmnk (edited 3 hours ago)

Is var = Category.x and val= c('Better', 'Similar', 'Worse')?

– Daniel
4 hours ago

Please see the updated post for commentary.

– tmfmnk
4 hours ago
