Strings in python

  1. Apr 15, 2015 #1

    BiGyElLoWhAt

    Gold Member

    I'm revisiting python, because recently our physics department decided to adopt it as our computational language, and I missed the class. I've also been given the impression that it's a very good language for algorithms and what-not.

    According to https://docs.python.org/3/tutorial/introduction.html
    This seems so useless!

    I'm really wanting to try to screw around with some simple AI, via writing a program that rewrites itself with new variables generated via input (for later reference).

    I'm anticipating the need to alter strings based on new input. Any workarounds? I went through the string methods library and couldn't find anything I thought would be of use:
    https://docs.python.org/3/library/stdtypes.html#string-methods

    Perhaps a way to generate a new string, copy the contents over, delete the old, copy them back to a string with the same name as the original but with the necessary alterations.

    Would I be better off just going with something along the lines of string_2 = string_1[:5] + 'new information' + string_1[5:]?
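The slicing idea in the question can be sketched as below. This is only a minimal illustration: the helper name `insert_at` and the sample text are made up for the example. It builds a new string from two slices of the old one and then rebinds the original name to the result, which gives the effect of an in-place edit.

```python
def insert_at(text, index, addition):
    """Return a NEW string with `addition` spliced in at `index`."""
    return text[:index] + addition + text[index:]


string_1 = "hello world"
# Rebinding the same name to the new string; the old string is never modified.
string_1 = insert_at(string_1, 5, ",")
print(string_1)  # hello, world
```

Reusing the name `string_1` is allowed because the name is just a label; only the string object it used to point at is immutable.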

    Also, I haven't figured out exactly how I'm going to implement a lot of these things, so if you have other ideas, feel free to toss them my way :smile:

    For right now the goal is to come up with a program that takes an input as a series of tokens, parses them into a 'command' object (or array, I don't know if python actually has objects), then executes the command which "alters" the program itself.
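One possible shape for the token-to-command idea is sketched below. Everything in it is an assumption made for illustration: the `Command` class, the `parse` and `execute` helpers, and the two verbs `set` and `forget` do not come from the thread. It also only alters a dictionary of stored state, which is a much simpler stand-in for a program that rewrites itself. Python does have objects, and a small frozen dataclass is one way to represent a command.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Command:
    """Hypothetical command object: a verb plus its arguments."""
    verb: str
    args: tuple


def parse(line):
    """Split an input line into tokens and wrap them in a Command."""
    tokens = line.split()
    if not tokens:
        raise ValueError("empty command")
    return Command(verb=tokens[0].lower(), args=tuple(tokens[1:]))


def execute(command, state):
    """Return a NEW state dict; the dict passed in is left untouched."""
    if command.verb == "set":
        name, value = command.args
        return {**state, name: value}
    if command.verb == "forget":
        (name,) = command.args
        return {k: v for k, v in state.items() if k != name}
    raise ValueError(f"unknown verb: {command.verb}")


state = {}
for line in ["set greeting hello", "set target world", "forget target"]:
    state = execute(parse(line), state)
print(state)  # {'greeting': 'hello'}
```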

    Thanks!
     
    Last edited by a moderator: May 7, 2017
  3. Apr 15, 2015 #2

    FactChecker

    Science Advisor
    Gold Member

    There are methods that allow the contents of the original string to be modified and placed in a new string. str.replace is one. Those methods should let you do whatever you need to do.
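As a small illustration of the point about `str.replace` (the sample strings are invented for the example), the method leaves the original string alone and returns a new one:

```python
original = "the quick brown fox"
changed = original.replace("quick", "slow")

print(original)  # the quick brown fox
print(changed)   # the slow brown fox

# The optional third argument limits how many occurrences are replaced.
once = "a-b-c".replace("-", "+", 1)
print(once)      # a+b-c
```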
     
  4. Apr 15, 2015 #3

    robphy

    Science Advisor
    Homework Helper
    Gold Member

    The section immediately after that line says:
    Possibly enlightening...
    http://stackoverflow.com/questions/...rings-immutable-best-practices-for-using-them
     
    Last edited by a moderator: May 7, 2017
  5. Apr 15, 2015 #4

    D H

    Staff Emeritus
    Science Advisor

    To the contrary!

    There's a lot, a whole lot, to be said in favor of immutability. You specifically mentioned AI in the opening post. Lisp, prolog, haskell, ocaml, and erlang are very widely used languages in the AI community. One of the things these languages share in common is a very strong concept of immutable objects. Another thing they share in common is the concept of functional programming.

    Python, C#, and java (and now c++) just touch on immutability and functional programming. None of these are particularly powerful AI languages. Immutability and functional programming go hand-in-hand, and both are central concepts in powerful AI languages.
     
  6. Apr 16, 2015 #5

    BiGyElLoWhAt

    Gold Member

    @robphy I know what the next line said. I mentioned making a new variable in the opening post.

    @D H I might have used AI a bit loosely. This is a very recent interest of mine, and what I have in mind for this program is more about mutating itself. I can see one argument for immutability, and that's the fact that some information should be immutable, but what I'm curious about is whether it needs to be immutable. Or could the program solve a recursive algorithm to decide what should be mutable or not (using the state/"experience" variables defined in the program to determine which variables to modify, keep and discard based on new input)?

    Something maybe medium term that I'd be interested in doing is (keeping along the lines of the op, programs writing programs) pattern recognition, but with the program "learning" about the problem solving process by referencing the thinking skills acquired through previous successful recognitions.

    When the pattern recognizing function gets called, the program would reference its algorithm bank, which would be regularly updated with new algorithm from the patterns. The thinking skills I'm talking about would essentially (in this basic program) be patterns between patterns such as the relativity of the functions y=x and y=2x .

    I think I have some relatively decent pseudocode (it seems great in my head haha), but there are some details I still need to work out (mostly specifics of the functions and how to implement them)
     
  7. Apr 16, 2015 #6

    FactChecker

    Science Advisor
    Gold Member

    Ah. Sorry. I missed that. So your complaint is about the hassle of changing a string while keeping the string name unchanged. That may be true. I believe that Python is currently oversold. There are several things regarding strings that are tedious to do in Python.
     
  8. Apr 16, 2015 #7

    BiGyElLoWhAt

    Gold Member

    That's what I'm gathering, unfortunately. But C and Java aren't good enough according to the physics department; we need Python. I figured messing with something like this would be a fun way to relearn this from high school.
     
  9. Apr 16, 2015 #8
    I don't understand the problem here with immutable strings. Let's say you did something like
    Code (Text):
    a = "hello"
    b = "world"
    a = a + b
    print(a)
    You'll see "helloworld"

    Now if you have something that was like this
    Code (Text):
    1. a = "hello"
    2. b = a
    3. a = "world"
    4. print(b)
     
    You'll see "hello" because 'b' was given a reference to the string 'a' had a reference to at line 2, and at line 3, 'a' was given a reference to a different string.

    At no point are the original strings modified (at least that is the conceptual model, various optimizations could be happening under the covers), but you get the effect you want.
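A minimal sketch of the same point, using made-up values: trying to change a character in place fails with a `TypeError`, while building a new string and rebinding the name works, and any other name that still refers to the old string is unaffected.

```python
a = "hello"
b = a  # b now refers to the same string object as a

try:
    a[0] = "j"  # strings do not support item assignment
except TypeError as exc:
    print("cannot modify in place:", exc)

# Build a new string instead and rebind the name a to it.
a = "j" + a[1:]

print(a)  # jello
print(b)  # hello
```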

    Note that I do most of my programming in 'C', and almost every string operation requires me to explicitly determine the length of the final string, allocate that memory (handling errors), and then copy character by character the input strings/characters into the space I just allocated [all of which is a huge potential source of random crashes and security vulnerabilities]. While there are benefits to how C does it (because it lets you control so much of the process you can really optimize building a large string composed from many different parameters), most of the time you just want what Python does.
     
  10. Apr 16, 2015 #9

    D H

    Staff Emeritus
    Science Advisor

    This is not correct.

    It is the string that is immutable, not the variable. What about the variable to which the string is assigned? It's variable. In fact, it's very variable. You can assign a string to a python variable, then a tuple, then a list, then an integer, and finally a dict. Demo:
    Code (Text):

    python
    >>> a = "hello"
    >>> type(a)
    <type 'str'>
    >>> a = (1,2)
    >>> type(a)
    <type 'tuple'>
    >>> a = [1,2]
    >>> type(a)
    <type 'list'>
    >>> a = 1
    >>> type(a)
    <type 'int'>
    >>> a = {'this':'that','foo':'bar'}
    >>> type(a)
    <type 'dict'>
     
  11. Apr 16, 2015 #10

    FactChecker

    Science Advisor
    Gold Member

    That's good to know.
     
  12. Apr 17, 2015 #11

    BiGyElLoWhAt

    Gold Member

    Hmm... Assuming you ripped that from idle, I guess I'm confused as to what an immutable string is, then. What exactly is immutable? The data in the stack?

    So if I define a string and assign a variable to it, say a = "hello", there are some bytes with a location and a binary value equivalent to hello. Now defining 'a' to a new value, a = "world", there are now some bytes with a new location and a binary value equivalent to world. Hello no longer has a reference, and is therefore inaccessible, because the program no longer has a reference to those bytes. Am I understanding string immutability correctly? If so, how is this useful? Hello is just occupying space in the RAM, and for a large program, that seems like a bad idea.

    This idea reminds me of the JASS language (for Warcraft III world editor); when you loop commands, you get these things that are called memory leaks, and for long duration games (~large programs for python) you have to bypass the GUI and hard code the removal of these previous commands, otherwise the game will eventually use all your available memory and will cause a crash (either the program or the computer, I'm not really sure, I've never made that large of a map). Is this also an example of pseudo immutability? All it does in JASS is cause problems that you have to work around. I think they do it because in the GUI there are commands you can issue that reference previous commands (You have events, conditions, and actions, and conditions can be things such as "a unit = triggering unit"). Now this would be useful for a period of time, but after a while, i.e. after my program can no longer reference the triggering unit comparison and get the same unit, that information should be dropped.

    Wouldn't it make sense to drop this string "hello" once the pointer's direction is relocated? What if I did something like this:
    a = "hello"
    print(a)
    a = "world"
    print(a)
    a = "hello"
    print(a)

    the output should be
    hello
    world
    hello

    but the second hello, will it have the same location as the first? If not, what methods are available to handle this?
     
  13. Apr 17, 2015 #12
    The Python system automatically handles this and there won't be any memory leaks.
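The automatic cleanup can be observed with a weak reference. This is a sketch under a few assumptions: the `Payload` class is a hypothetical stand-in, because plain `str` objects cannot be weakly referenced, and the immediate cleanup shown here relies on CPython's reference counting. The `gc.collect()` call is there for implementations that free objects later.

```python
import gc
import weakref


class Payload:
    """Hypothetical stand-in object holding some text."""

    def __init__(self, text):
        self.text = text


a = Payload("hello")
watcher = weakref.ref(a)  # observes the object without keeping it alive
print(watcher() is a)     # True

a = Payload("world")      # rebinding: nothing refers to the "hello" payload now
gc.collect()              # redundant on CPython, useful on other implementations
print(watcher())          # None, the old object has been freed
```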

    Some of the benefits of 'immutability' (not just for strings) are: functions can't modify the data itself (no side effects), data is thread safe, it is easier to cache, and it simplifies everything for the language user.
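Two of these benefits can be shown in a few lines. The function names `shout` and `shout_list` are invented for the example. A function cannot change the caller's string, whereas it can change the caller's list, and because a string's value never changes it can safely be used as a dictionary key.

```python
def shout(text):
    # Rebinds the local name only; the caller's string is untouched.
    text = text.upper() + "!"
    return text


def shout_list(words):
    # Mutates the caller's list in place: a visible side effect.
    words.append("!")
    return words


greeting = "hello"
shout(greeting)
print(greeting)  # hello

words = ["hello"]
shout_list(words)
print(words)     # ['hello', '!']

# Strings are hashable, so they work as dictionary keys (a simple cache).
counts = {"hello": 1}
counts["hello"] += 1
print(counts)    # {'hello': 2}
```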
     
  14. Apr 18, 2015 #13
    If you want a mutable array, do this:

    import array
    myarray=array.array('c', "hello")

    If you later need a string with the same data as in the array, do this:
    mystring=myarray.tostring()
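Note that the `'c'` typecode exists only in Python 2, and `array.tostring()` was removed in Python 3.9. On Python 3, two common substitutes are a list of characters or a `bytearray`. The snippet below is a sketch of those alternatives, not a drop-in replacement for the code above.

```python
# Option 1: a list of one-character strings, joined back when needed.
chars = list("hello")
chars[0] = "j"
mystring = "".join(chars)
print(mystring)  # jello

# Option 2: a bytearray, which is a mutable sequence of bytes.
buf = bytearray(b"hello")
buf[0] = ord("y")
text = buf.decode("ascii")
print(text)      # yello
```

The list version works for any Unicode text; the `bytearray` version is only appropriate when the data is really bytes in a known encoding.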
     
  15. Apr 30, 2015 #14
    Forget about words like stack and heap. Python abstracts all that away, and you simply don't need to know. The interpreter does a lot of 'magic' and it's not at all close to the metal. Unless you count the VM itself, CPython doesn't even have a stack. Everything you create is an object (even integers) and is stored on the private heap.

    Immutable means the value cannot be changed. To change it, you assign a new one with the value you require.

    This is nothing to do with immutability, which is simply that the value can't be changed.

    Don't worry about cleaning up variables that are no longer used. The garbage collector will pick them up, so they won't leak. Python will automatically free them once it sees they are no longer used. You only really need to take care to close things such as file handles, which are much more critical to return (they are fewer in number, and an opened file is locked and inaccessible to other programs). Python has the 'with' context manager for dealing with resources such as file handles.
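A short sketch of the `with` statement for file handles follows. The file name `notes.txt` and the temporary directory are assumptions made so the example can run anywhere. The file is closed as soon as the `with` block ends, even if an exception is raised inside it.

```python
import os
import tempfile

# A throwaway directory keeps the example self-contained.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "notes.txt")

    with open(path, "w") as handle:
        handle.write("hello\n")
    # Leaving the block closed the file; no explicit close() call is needed.
    closed_after_write = handle.closed

    with open(path) as handle:
        contents = handle.read()

print(closed_after_write)  # True
print(repr(contents))      # 'hello\n'
```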

    Only worry about RAM if you start to run out (it's possible, the garbage collector isn't perfect).

    It is. You know you could try that in the interpreter, pretty trivially :-)

    If you want to fully understand how assignment and variables work in python, see here: http://nedbatchelder.com/text/names.html

    Be aware that it's not a straight value or reference like it is in C.

    Idiomatically the recommendation for programming in Python is to always assign, and never mutate, unless the cost of assignment is too high. The reason being some things are, and some things aren't, mutable, and you can't always keep track of which is which due to duck typing. Therefore always assign, and your code will do what you think it does. I've seen too many bugs relating to mutating and returning from a function so I second the idiomatic recommendation.
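The kind of bug described here can be sketched with two hypothetical functions (the names and the sample readings are invented for the example). One mutates its argument and returns it; the other builds and returns a new list.

```python
def add_reading_mutating(readings, value):
    # Changes the caller's list AND returns it: easy to misuse.
    readings.append(value)
    return readings


def add_reading_assigning(readings, value):
    # Builds a new list; the caller's list is left alone.
    return readings + [value]


original = [1.0, 2.0]

result = add_reading_assigning(original, 3.0)
print(original)  # [1.0, 2.0]
print(result)    # [1.0, 2.0, 3.0]

alias = add_reading_mutating(original, 3.0)
print(original)           # [1.0, 2.0, 3.0], changed behind the caller's back
print(alias is original)  # True, the "result" is the same object
```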

    If you care about performance and memory usage, use Fortran for your scientific code. If you want to get the job done fast and go home early, use Python. Python does have a kickass set of mathematical and scientific libraries too (SciPy, NumPy, pandas etc).
     
    Last edited: Apr 30, 2015
  16. May 1, 2015 #15

    D H

    Staff Emeritus
    Science Advisor

    Python (and Matlab and perl and lua and ruby and ...) let you think and work at a higher level. There's a cost to be paid; scripting languages are not exactly fast. So what if what you wrote is a bit bloated memory-wise and a bit sluggish time-wise? Sometimes that's no big deal. You got the job done, it works. Time to move on to the next problem.

    Even in the cases where those performance and memory penalties are too much to bear, it oftentimes still pays off to start in python or some other scripting language. My first guess at a solution is typically completely wrong. I'll throw it out and start over again, and rinse and repeat until I get it right. If I had started off in a compiled language from the start, I would still be working on bad_idea_number_1 in the time it took me to write bad_idea_number_1.py, bad_idea_number_2.py, ..., not_so_bad.py, getting_close.py, nailed_it.py. Now all that's left is translating that slow and bloated nailed_it.py to nailed_it.cpp or nailed_it.f.


    That is indeed the output. Try it. The answer to your question about whether the address of the second hello will be the same as the first is "maybe", but you shouldn't care.

    Demo:
    Code (Text):
    python
    >>> a = "hello"
    >>> id(a)
    4412112304
    >>> a = "world"
    >>> id(a)
    4412112400
    >>> a = "hello"
    >>> id(a)
    4412112304
    In this case, it's the same.

    Code (Text):
    python
    >>> a = "hello"
    >>> a
    'hello'
    >>> id(a)
    4541693360
    >>> a = "world"
    >>> a
    'world'
    >>> id(a)
    4541693600
    >>> a = "hello"
    >>> a
    'hello'
    >>> id(a)
    4541693696
    And now it's different.

    Perhaps the garbage collector kicked in somewhere between a="world" and the second a="hello" and collected the first "hello". Perhaps something else happened. It really doesn't matter which.
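Since the address may or may not be reused, code should compare strings by value with `==` rather than by identity. The sketch below uses made-up names. It only prints results that are guaranteed, and leaves the identity check on the runtime-built string without an expected output because that result is an implementation detail.

```python
first = "hello"
second = "world"

a = first
print(id(a) == id(first))       # True: a and first name the same object

a = second
print(id(a) == id(second))      # True: a now names the other object
print(id(first) != id(second))  # True: two live objects never share an id

# A string built at run time has the same VALUE as the literal...
built = "".join(["hel", "lo"])
print(built == first)           # True

# ...but whether it is the same OBJECT is up to the implementation.
print(built is first)
```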
     
  17. May 1, 2015 #16

    BiGyElLoWhAt

    Gold Member

    Yeah, that wasn't a typical example I was worried about, just a simplified example of what I was trying to understand. I'll check out that link here soon @Carno Raar
     