Calculating Word Lengths in Fiction & Non-Fiction Books

  • Thread starter: Natasha1
  • Tags: Books, Fiction

Homework Help Overview

The discussion revolves around calculating statistical measures of word lengths in two books, one fiction and one non-fiction. Participants are exploring how to determine the mean, median, mode, and standard deviation of word lengths based on a sample size of words from each book.

Discussion Character

  • Exploratory, Assumption checking, Problem interpretation

Approaches and Questions Raised

  • Some participants suggest using programming languages like C or Tcl to automate the calculations, while others express uncertainty about programming skills. There is discussion about the feasibility of calculating statistics by hand or using software like Excel. Participants also question the appropriate sample size needed to achieve reliable statistical results.

Discussion Status

The conversation is ongoing, with various suggestions and methods being explored. Some participants have offered guidance on potential approaches, such as using random sampling techniques to select words from the texts. However, there is no explicit consensus on how to proceed, and some participants are still seeking clarification on the task.

Contextual Notes

Participants are considering the implications of word length differences between fiction and non-fiction, and there are mentions of the variability in word choice based on the nature of the texts. The original poster has requested help in understanding how to approach the problem without needing detailed arithmetic.

Natasha1
Take two books by different authors, one fiction and one non-fiction. Choose a reasonable sample size of words from each (say 100,000 words for the fiction one and 80,000 words for the non-fiction one) and find the mean, median, and modal word length and the standard deviation for each.
 
Pretty easy. Just write a C program, or even just a Tcl script.
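As a rough illustration of the kind of script meant here, the following is a minimal sketch in Python rather than C or Tcl. The function name `word_length_stats` and the rule for what counts as a word (a run of letters, with apostrophes ignored when counting) are assumptions made for this example, not something specified in the thread.

```python
import re
import statistics

def word_length_stats(text):
    """Return n, mean, median, mode and sample standard deviation of word lengths."""
    # Assumption: a "word" is a run of letters, optionally with internal apostrophes.
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)*", text)
    # Assumption: apostrophes do not count towards the length ("don't" has 4 letters).
    lengths = [len(w.replace("'", "")) for w in words]
    return {
        "n": len(lengths),
        "mean": statistics.mean(lengths),
        "median": statistics.median(lengths),
        "mode": statistics.mode(lengths),
        "stdev": statistics.stdev(lengths),  # sample standard deviation (n - 1)
    }

sample = "The quick brown fox jumps over the lazy dog"
stats = word_length_stats(sample)
print(stats)
```

For a whole book, the same function could be applied to the contents of a plain-text file, if one is available.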
 
berkeman said:
Pretty easy. Just write a C program, or even just a Tcl script.

Don't know anything about programming, I'm afraid.
 
Well how in the world are you supposed to calculate those stats? By hand?!

Maybe what they are asking is how few words can you use as your sample size in order to get those stats to within some amount of error?

I guess you could do it in Excel if you had to... mighty big spreadsheet, though.
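The sample-size question raised above can be sketched with the usual normal approximation for the mean: to estimate the mean word length to within a margin E, roughly n = (z·s/E)² words are needed, where s is the standard deviation of word length. The function name `required_sample_size` and the value s = 2.5 letters are hypothetical, chosen only to show the order of magnitude.

```python
import math

def required_sample_size(stdev, margin, z=1.96):
    """Smallest n for which the sample mean lies within `margin` of the true
    mean at roughly 95% confidence (z = 1.96), assuming a simple random sample."""
    return math.ceil((z * stdev / margin) ** 2)

# Hypothetical standard deviation of 2.5 letters per word.
print(required_sample_size(2.5, 0.5))   # 97 words for +/- 0.5 letters
print(required_sample_size(2.5, 0.25))  # 385 words for +/- 0.25 letters
```

Under these assumptions, halving the margin of error roughly quadruples the number of words needed, and a sample far smaller than 100,000 words already gives a usable estimate of the mean.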
 
berkeman said:
Well how in the world are you supposed to calculate those stats? By hand?!

Maybe what they are asking is how few words can you use as your sample size in order to get those stats to within some amount of error?

I guess you could do it in Excel if you had to... mighty big spreadsheet, though.

OK, this is how the whole question is asked:

Any help would be much much appreciated in advance :-)

Would you expect word-length in general to differ in fiction and non-fiction books?

Take two books, of different authors, one fiction, one non-fiction. Choose a reasonable sample size of words from each, and find the mean, median, modal word-length in each and standard deviation. Make it clear how you have done the various calculations without presenting detailed arithmetic.

In the light of the figures found, comment on the initial question (no formal inferential work needed, just an informed view from the figures found).
 
Well, it obviously depends a lot on the books (some non-fiction books will by their nature use bigger words). But now that the sample size of around 100,000 words has been dropped, the task becomes much more doable by hand.

Do you know how to calculate those statistics? Do you have a statistics calculator? If not, do you know how to use these functions in Excel? If not, do you have access to Excel? (just use the Help feature to show you how to enter the functions)

BTW, to get a good statistical sample without having to enter too many word sizes into your Excel spreadsheet, I'd use the close-the-eyes, flip-open-randomly, and poke-a-word technique for choosing about 40-50 words randomly from each book.
 
Well, flip it open randomly and instead of poking a word choose say the fifth word. If you poke a word you would be biased towards selecting longer words.
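The bias described here can be checked with a small simulation. This sketch assumes that a blind poke lands on a word with probability proportional to its length, which is a simplification, and compares it with picking word positions uniformly. The example sentence and the function names are made up for illustration.

```python
import random
import statistics

def poke_sample(words, k, rng):
    """Simulate poking: the chance of hitting a word is proportional to its length."""
    return rng.choices(words, weights=[len(w) for w in words], k=k)

def index_sample(words, k, rng):
    """Pick word positions uniformly, roughly what counting to a fixed word does."""
    return rng.choices(words, k=k)

rng = random.Random(0)  # fixed seed so the run is repeatable
words = "a cat sat on the extraordinarily comfortable mat in an old house".split()

true_mean = statistics.mean(len(w) for w in words)
poke_mean = statistics.mean(len(w) for w in poke_sample(words, 5000, rng))
index_mean = statistics.mean(len(w) for w in index_sample(words, 5000, rng))

print(f"true mean length:       {true_mean:.2f}")
print(f"mean from poking:       {poke_mean:.2f}")
print(f"mean from word indices: {index_mean:.2f}")
```

In this model the poked sample overstates the mean word length, because long words occupy more of the page, while the index-based sample stays close to the true mean.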
 
0rthodontist said:
Well, flip it open randomly and instead of poking a word choose say the fifth word. If you poke a word you would be biased towards selecting longer words.

I don't really get it. What am I supposed to do exactly? Can someone just start the problem for me to set me in the right direction :-)
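One possible way to start, as a sketch only: write down the length of each sampled word, tally how many words have each length, and compute the four statistics from that frequency table. The tally below is invented for illustration (50 hypothetical words from one book) and is not real data from any book.

```python
import math

# Hypothetical tally: word length -> number of sampled words with that length.
tally = {1: 2, 2: 8, 3: 12, 4: 9, 5: 7, 6: 5, 7: 4, 8: 2, 9: 1}

n = sum(tally.values())

# Mean: total number of letters divided by the number of words.
mean = sum(length * count for length, count in tally.items()) / n

# Mode: the length with the highest count.
mode = max(tally, key=tally.get)

# Median: the middle value once all lengths are listed in order.
ordered = [length for length in sorted(tally) for _ in range(tally[length])]
mid = n // 2
median = ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2

# Sample standard deviation (divides by n - 1).
variance = sum(count * (length - mean) ** 2 for length, count in tally.items()) / (n - 1)
stdev = math.sqrt(variance)

print(n, mean, median, mode, round(stdev, 2))
```

Repeating the same tally for the second book gives two sets of figures to compare when commenting on the fiction versus non-fiction question.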
 
