What (Implicit) Command Defines Values in Dictionary?

  • Thread starter: WWGD
  • Tags: Implicit
AI Thread Summary
The discussion focuses on understanding how to create a dictionary in Python that counts character frequencies in a string. The user is confused about how keys and values are defined in the dictionary, specifically regarding the use of the `setdefault()` method and the naming of the dictionary variable as `count`. Suggestions include using `collections.defaultdict(int)` for easier implementation and the `+=` operator for incrementing counts. The conversation also touches on the differences between Python 2.7 and 3.7, IDE preferences, and the importance of testing code interactively to understand its functionality. Overall, the thread emphasizes clarity in code structure and the flexibility of Python's dictionary methods.
WWGD
Hi, I am looking into some code in Python 3.7.2 that counts the number of appearances of each character in a message string.
We are given a string. We then define an empty dictionary that will ultimately be filled with the characters as the keys and the number of appearances of each character as the values, i.e., we will go from:
{} to: {Char1: Frequency of Char1, ..., Charn: Frequency of Charn}, where the frequency is the number of appearances of the character in the message string.
Now, I understand why we use count() and setdefault(). QUESTION: I just don't see where we are specifying what the keys and values of the dictionary are. How do we know characters will be the keys and their frequencies will be the values? (Obviously this is what we want to do, but how am I communicating this to Python in the code I am writing?) Is it in the line of code count[character]?
EDIT: I am also confused by the fact that we are using the same term, count, as the name of the dictionary and as the function count(). Isn't 'count' a reserved word since count() is a function, or is it count() that is reserved?
EDIT2: I am also confused by the fact that the syntax for count() I am familiar with is string.count(substring, start, end) and not the one in the code below.

Python:
message = 'It was a cold day in April and the clocks were striking thirteen. '
count = {}

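for character in message:
    # Create the key with a starting count of 0 if it is not there yet.
    count.setdefault(character, 0)
    # Add one to the count stored under this character.
    count[character] = count[character] + 1
print(count)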
I (think I) understand the following (please correct me if I am wrong, and/or comment):
We start with a message string. We will count the frequency of each character in the string.
We start with an empty dictionary, which will be updated.
We use setdefault(character, 0) to guard against cases where a character is not in the message, to avoid exceptions/errors in case a character does not appear.

We then loop over each character in message, counting the number of appearances of each character.
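
To make the roles of key and value concrete, here is a minimal sketch that traces the same loop on a much shorter string. The string 'aba' and the per-step print are illustrative assumptions, not part of the original code. The expression inside the square brackets supplies the key, and whatever is assigned to count[character] becomes the value stored under that key. Note that setdefault() only acts when the key is missing from the dictionary.

```python
message = 'aba'   # short illustrative string, not the original message
count = {}

for character in message:
    # If the key is missing from the dictionary, store 0 under it;
    # if the key is already present, this call changes nothing.
    count.setdefault(character, 0)
    # The expression in [] is the key; the right-hand side is the new value.
    count[character] = count[character] + 1
    print(character, count)

# Keys are one-character strings and values are ints only because
# those are the objects the loop happened to put there.
print(count)
```

Nothing here declares "keys are characters"; the dictionary simply records whatever objects the loop uses as subscripts and whatever objects it assigns.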
 
WWGD said:
We use setdefault(character, 0) to guard against cases where a character is not in the message, to avoid exceptions/errors in case a character does not appear.

An easier way to do this is to make count a collections.defaultdict(int). Then an entry will automatically be created any time a key is accessed that's not in the dictionary.

You could also use the += operator to increment the count.

Finally, your print statement has brackets instead of parentheses.

A general comment: the easiest way to see if code does what you want it to is to run it. Have you tried just typing this code at the Python prompt to see what happens? One of the nicest things about Python is how easy it is to test code this way, interactively.

(Running the code would also have shown you another typo in your code, in addition to the error in the print statement.)
 
PeterDonis said:
An easier way to do this is to make count a collections.defaultdict(int). Then an entry will automatically be created any time a key is accessed that's not in the dictionary.

You could also use the += operator to increment the count.

Finally, your print statement has brackets instead of parentheses.

A general comment: the easiest way to see if code does what you want it to is to run it. Have you tried just typing this code at the Python prompt to see what happens? One of the nicest things about Python is how easy it is to test code this way, interactively.

(Running the code would also have shown you another typo in your code, in addition to the error in the print statement.)
Thanks. I did try it and it did run. I just want to understand why it ran and what each of the parts does, so that I will be able to write my own code at some point, hopefully soon. I corrected the brackets on print(count). EDIT: I guess I can try different approaches and see if they run, but I know relatively little at this point. Still, I will try.
 
where did this code come from? message ##\neq ## Message (capitalization matters), print function should use () not [] (flagged by Peter above)...

here's how I'd do it... use .get()

Code:
Message = 'It was a cold day in April and the clocks were striking thirteen. '
counter = {}

for character in Message:
    # .get() returns 0 when the character has not been seen yet.
    counter[character] = counter.get(character, 0) + 1
print(counter)
the for loop can be streamlined with a dictionary comprehension, but that may be outside the scope right now
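
One possible reading of the dictionary-comprehension suggestion is sketched below. It is not the poster's code: it assumes the built-in str.count() method is used to count each distinct character, and the variable names are illustrative.

```python
message = 'It was a cold day in April and the clocks were striking thirteen. '

# set(message) yields each distinct character once; str.count() then
# counts how many times that character occurs in the whole string.
counter = {character: message.count(character) for character in set(message)}
print(counter)
```

This version rescans the string once per distinct character, so the explicit loop may be preferable for long inputs.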

- - - -
edit: I'm nearly certain I did an exercise like this out of Think Python years back...
great little book made freely available by the author here
https://greenteapress.com/wp/think-python/
 
WWGD said:
how am I communicating this to Python in the code I am writing?

You're not. You're telling Python to perform certain operations referencing certain variables, but you're not telling Python the types of the objects that the variables point to. The Python interpreter figures that out on the fly, at run time, and throws an exception if you try to do something that isn't allowed by the types you're operating on.

For example, try the code with counts[character] += character instead of counts[character] += 1 and see what happens.
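
The experiment suggested above can be sketched as follows, assuming counts is a collections.defaultdict(int). The shortened string and the try/except wrapper are additions for illustration, so that the exception is reported instead of stopping the script.

```python
import collections

message = 'It was a cold day'   # shortened illustrative string
counts = collections.defaultdict(int)

error_text = None
try:
    for character in message:
        # The missing key defaults to the int 0, so this attempts to add
        # a str to an int, which Python refuses to do.
        counts[character] += character
except TypeError as error:
    error_text = str(error)
    print('TypeError:', error_text)
```

The failure happens at run time, on the first pass through the loop, which is when the interpreter first sees the actual types involved.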
 
StoneTemplePython said:
here's how I'd do it... use .get()

The += operator, which I suggested, is even shorter, when combined with collections.defaultdict.

Python:
import collections

message = 'It was a cold day in April and the clocks were striking thirteen. '
counts = collections.defaultdict(int)
for character in message:
    counts[character] += 1
print(counts)
 
StoneTemplePython said:
where did this code come from? message ##\neq ## Message (capitalization matters), print function should use () not [] (flagged by Peter above)...

here's how I'd do it... use .get()

Code:
Message = 'It was a cold day in April and the clocks were striking thirteen. '
counter = {}

for character in Message:
    # .get() returns 0 when the character has not been seen yet.
    counter[character] = counter.get(character, 0) + 1
print(counter)
the for loop can be streamlined with a dictionary comprehension, but that may be outside the scope right now

Sorry for my sloppiness, corrected 'Message' into 'message'. It comes from Ch 5 of :

https://automatetheboringstuff.com/chapter5/ , section : default(), like 30% of the way from the top.
 
WWGD said:
Sorry for my sloppiness, corrected 'Message' into 'message'. It comes from Ch 5 of :

https://automatetheboringstuff.com/chapter5/ , section : default(), like 30% of the way from the top.
"Automate the Boring Stuff" is a great book to work through
 
StoneTemplePython said:
the for loop can be streamlined with a dictionary comprehension

collections.Counter is even easier, but before that was added to the collections module, yes, a dict comprehension is how I would have done it. But for some people that might be less readable.
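
For comparison, a minimal sketch of the collections.Counter approach mentioned above; the call to most_common() is an extra illustration, not something requested in the thread.

```python
import collections

message = 'It was a cold day in April and the clocks were striking thirteen. '

# Counter is a dict subclass; passing it an iterable counts the items.
counts = collections.Counter(message)
print(counts)

# most_common(n) lists the n most frequent items with their counts.
print(counts.most_common(3))
```

A Counter also returns 0 for a key it has never seen, so no setdefault() or .get() call is needed when reading from it.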
 
  • #10
Sorry if this is OT, but is IDLE as an IDE better than Anaconda? I was working with Python 2.7 in Anaconda, but for complicated reasons I switched to IDLE. It seems Anaconda packs much more of a punch, in that it seems to include more general options (ignorant expression from a newbie). Is there a major difference between 2.7 and 3.7.2?
 
  • #11
PeterDonis said:
The += operator, which I suggested, is even shorter, when combined with collections.defaultdict.

PeterDonis said:
collections.Counter is even easier, but before that was added to the collections module, yes, a dict comprehension is how I would have done it. But for some people that might be less readable.

It is partly a matter of taste... I haven't used collections.Counter much. There is a strange emphasis in some parts of the Python community on writing lots of one-liners at the cost of readability.

On the other hand, dictionary comprehensions do have a major benefit of (i) letting the hash table 'know' how big it will be (ties into collision management and speed for very large dictionaries) and (ii) the dictionary comprehension is pretty portable to other languages (e.g. almost the same code in Julia).

in any case the standard dictionary and .get() seem worth knowing -- the latter method is rather flexible e.g. for non-zero starting values.
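
As an illustration of the non-zero starting value mentioned above, here is a small sketch. The stock dictionary, the starting_level of 10, and the item names are all hypothetical.

```python
starting_level = 10   # hypothetical default for items not seen before
stock = {'apples': 3}

for item in ['apples', 'pears', 'pears']:
    # .get() returns the stored value, or starting_level if the key is
    # absent; it never modifies the dictionary by itself.
    stock[item] = stock.get(item, starting_level) - 1

print(stock)
```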
 
  • #12
WWGD said:
Sorry if this is OT, but is IDLE as an IDE better than Anaconda? I was working with Python 2.7 in Anaconda, but for complicated reasons I switched to IDLE. It seems Anaconda packs much more of a punch, in that it seems to include more general options (ignorant expression from a newbie). Is there a major difference between 2.7 and 3.7.2?

Anaconda is actually a suitcase with a lot of stuff in it -- I mostly use the Anaconda distribution with Python 3.6 or whatever 3.x version. I like the conda package manager and some other stuff. I do not like the Anaconda IDE, so I don't use it... I don't really use IDLE directly either.

I prefer PyCharm or Atom for an IDE.
Atom may be a bit... not curated enough for you. Check out the free PyCharm Community Edition.

please use Python 3.x...
 
  • #13
Is it a good analogy to see the use of '.' as "functional composition"? i.e., count.setdefault is the composition of the count function with the default function. And, I guess the word 'count' is not reserved despite the fact that it is a function, because we are not using count()?
 
  • #14
StoneTemplePython said:
letting the hash table 'know' how big it will be

How does that happen? AFAIK the Python dict implementation doesn't even know whether it gets invoked in a dict comprehension or in procedural code; it's the same either way.
 
  • #15
WWGD said:
Is it a good analogy to see the use of '.' as "functional composition"? i.e., count.setdefault is the composition of the count function with the default function.

I don't really think so, since "the default function" isn't really a function. In your case it's just a fixed integer.
 
  • #16
PeterDonis said:
How does that happen? AFAIK the Python dict implementation doesn't even know whether it gets invoked in a dict comprehension or in procedural code; it's the same either way.
Not sure -- I may be conflating different languages (or even data structures) o_O
 
  • #17
PeterDonis said:
I don't really think so, since "the default function" isn't really a function. In your case it's just a fixed integer.
But isn't the integer an output of the function default(a,b)? Or is default a method? EDIT: It is a method.
 
  • #18
WWGD said:
EDIT: It is a method.

It is? Of what object?

setdefault is a method of the dict object. Is that what you mean?
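
A short sketch of the points touched on here, under the assumption that the standard keyword module is an acceptable way to check for reserved words: the dot looks up a method on the dict object, setdefault() returns a value and is not composed with anything, and count is an ordinary name.

```python
import keyword

count = {}   # an ordinary variable name bound to a dict

# 'count' is not a reserved word, so it can be used as a variable name.
print(keyword.iskeyword('count'))   # False
print(keyword.iskeyword('for'))     # True

# The dot looks up the attribute named setdefault on the dict type.
print(hasattr(dict, 'setdefault'))

# setdefault returns the value stored under the key, inserting the
# default first if the key was missing.
first = count.setdefault('a', 0)
count['a'] += 5
second = count.setdefault('a', 0)   # key exists, so the default is ignored
print(first, second, count)

# str.count is a separate method that lives on strings, so it does not
# clash with a dictionary that happens to be named count.
occurrences = 'banana'.count('a')
print(occurrences)
```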
 
  • #19
WWGD said:
It comes from Ch 5 of :
https://automatetheboringstuff.com/chapter5/ , section : default(), like 30% of the way from the top.
btw, the author of that book gave a talk at a Python meetup in the Bay Area about a month or two ago. You may want to check out Python meetups in NY -- maybe Sweigart will show up
 
  • #20
WWGD said:
Sorry if this is OT, but is IDLE as an IDE better than Anaconda? I was working with Python 2.7 in Anaconda, but for complicated reasons I switched to IDLE. It seems Anaconda packs much more of a punch, in that it seems to include more general options (ignorant expression from a newbie). Is there a major difference between 2.7 and 3.7.2?
I use Visual Studio for C, C++, and assembly programming, but when I was teaching myself Python a few years ago, I didn't use an IDE at all -- nothing more complex than Notepad + command prompt.

I don't know what the debugging capabilities are in IDLE or whatever, but Python comes with a rudimentary debugger already built in. I wrote a couple of Insights articles on this debugger a while back - https://www.physicsforums.com/insights/simple-python-debugging-pdb-part-1/ and https://www.physicsforums.com/insights/simple-python-debugging-pdb-part-2/. As you get deeper into Python programming, or for that matter, any language, getting comfortable with a debugger is an extremely important thing to do.
 
  • #21
PeterDonis said:
It is? Of what object?

setdefault is a method of the dict object. Is that what you mean?
Yes, sorry, that is what I meant.
 