Elementary Python Questions: Data Frames, k-nary functions

In summary: The general pattern for defining a function without a prespecified argument list is def f(*args, **kwargs): print(args); print(kwargs). Calling f with arbitrary arguments collects the positional values into a tuple called args and the named values into a dict called kwargs. For example, f(1, 2, 3, a=4, b=5) would print (1, 2, 3) followed by {'a': 4, 'b': 5}.
  • #1
WWGD
Hi All,
A couple of questions, please:

1) Say df is a dataframe in Python Pandas, and I select a specific column from df:
Y=df[column].values.
What kind of data structure is Y?

2)
I want to find the sum of two numbers:
def Sum(a=0, b=0):
    return a + b

If I want to find a sum over some data structure (say a list), how can
I define sum, i.e., how do I extend it from a binary operation? Should I use
recursion and/or some 'for' clause?

Thanks.
 
  • #2
Here’s a tutorial on pandas; it shows ints, floats and strings as column types in a dataframe.

https://www.tutorialspoint.com/python_pandas/python_pandas_dataframe.htm

Tuples can be used, as well as other Python data types. The best way to find out, though, is to try it yourself.

The key point is that a column should use the same data type through all its rows, although that may not be strictly true either.

It's similar to a SQL table schema.

Instead of .values, it's suggested to use .to_numpy().

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.values.html

So you could extract a column into a numpy array and use numpy functions to sum it or get stats...
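As a quick check of the point about .values and .to_numpy(), here is a minimal sketch. The frame contents and the column name "height" are made up for illustration and are not from the thread.

```python
import numpy as np
import pandas as pd

# Hypothetical data; the column name "height" is only for illustration.
df = pd.DataFrame({"height": [1.5, 1.7, 1.6], "name": ["a", "b", "c"]})

y_values = df["height"].values     # older spelling
y_numpy = df["height"].to_numpy()  # spelling recommended by the pandas docs

print(type(y_values))  # a numpy.ndarray for this numeric column
print(y_numpy.dtype)   # float64 for this column
print(y_numpy.sum(), y_numpy.mean())
```

So for a plain numeric column, Y in the original question would be a numpy array, and the usual numpy methods apply to it.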
 
  • #3
WWGD said:
If I want to find a sum over some data structure (say a list), how can
I define sum

You don't have to, the Python built-in sum function takes any iterable as an argument.
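A small sketch of what "any iterable" covers in practice. The variable names and values are made up for illustration.

```python
numbers = [1.5, 2.5, 3.0]
prices = {"apple": 2, "pear": 3}

list_total = sum(numbers)                    # sum over a list
dict_total = sum(prices.values())            # sum over the values of a dict
square_total = sum(x * x for x in range(4))  # sum over a generator: 0 + 1 + 4 + 9

print(list_total, dict_total, square_total)
```

No explicit 'for' loop or recursion is needed; the builtin does the iteration itself.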
 
  • #4
PeterDonis said:
You don't have to, the Python built-in sum function takes any iterable as an argument.
In this context, I suspect @WWGD wants a function that returns a list whose ith element is a[i]+b[i], which isn't what the builtin sum function does with a list-type object. Builtin sum returns the sum of all elements in one list, and fails with multiple lists.

Assuming I'm understanding the problem correctly, I think you want something like
Python:
def sum(a, b):
    if isinstance(a, (list, tuple)) and isinstance(b, (list, tuple)):
        return [sum(ai, bi) for (ai, bi) in zip(a, b)]
    else:
        return a + b
What that does is check if a and b are lists or tuples. If they are it adds them element by element, calling itself so it can handle lists-of-lists. Otherwise it just tries adding them.

I don't know if that's the most efficient way to do things. You would also want to add sense checking - for example, what would you want the behaviour to be if the lists are different lengths? And you may want to replace the isinstance calls with something better suited to your application. And you may not want the recursive behaviour with lists of lists.
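One possible way to handle the different-lengths question is sketched below. The name add_elementwise is hypothetical, and raising an error on mismatch is only one of several reasonable choices (zip on its own would silently truncate to the shorter list).

```python
def add_elementwise(a, b):
    """Add two equal-length sequences element by element.

    Hypothetical helper, not from the thread: unlike the recursive
    version, it does not descend into nested lists.
    """
    if len(a) != len(b):
        raise ValueError(f"length mismatch: {len(a)} != {len(b)}")
    return [ai + bi for ai, bi in zip(a, b)]


print(add_elementwise([1, 2, 3], [10, 20, 30]))
```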
 
  • #5
Thank you. I was actually trying to define a k-nary function ##(x_1, x_2, \ldots, x_n) \rightarrow x_1 + x_2 + \cdots + x_n##.

This is trivial for 2 terms:
def sum(x_1, x_2):
    return x_1 + x_2

But I was hoping to define a sum over, say, a list, or maybe the values in a dictionary, etc., and could not think of a way of defining it. I thought I would need a 'for' clause somewhere, but it was not clear to me how.

I tried using it to define variance, but I got an error message about not being able to iterate over floats.
 
  • #6
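In that case, Peter's answer is correct about the builtin sum function. If a is a list, sum(a) will return the sum of the elements of the list. Note that sum takes a single iterable (plus an optional start value), so to add n separate variables you need to wrap them in a list or tuple first, as in sum([x1, x2, ..., xn]).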

The general pattern for defining a function without a prespecified argument list is
Python:
def f(*args, **kwargs):
    print(args)
    print(kwargs)
Calling f with arbitrary arguments will put the values into a tuple called args or a dict called kwargs, depending on whether you named the arguments or not. f(1, 2, 3, a=4, b=5) would make args = (1, 2, 3) and kwargs a two-element dictionary with keys "a" and "b" and corresponding values 4 and 5.

You don't have to specify both of *args and **kwargs if you only need one. As far as I'm aware the names args and kwargs are merely conventional and the asterisks are the important things, but I don't think I've ever seen anyone use anything other than args and kwargs.
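To connect this back to the k-nary sum asked about earlier, here is a minimal sketch. The function name total is hypothetical; f is changed to return its arguments rather than print them so the result can be inspected. Python collects the positional arguments into a tuple.

```python
def f(*args, **kwargs):
    # Return instead of print so the collected values can be inspected.
    return args, kwargs


def total(*numbers):
    """k-nary sum: accepts any number of positional arguments."""
    result = 0
    for x in numbers:  # the 'for' clause from the original question
        result += x
    return result


args, kwargs = f(1, 2, 3, a=4, b=5)
print(args)    # (1, 2, 3)
print(kwargs)  # {'a': 4, 'b': 5}

print(total(1, 2, 3, 4))  # separate arguments
values = [1.5, 2.5]
print(total(*values))     # an existing list, unpacked with *
```

In practice the loop body could simply be return sum(numbers); the explicit loop is only there to show the mechanics.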
 
  • #7
Oh, and
Python:
def f(a, b, *args, **kwargs):
is perfectly acceptable usage - f(1, 2, 3) will put 1 and 2 in a and b and args will be (3,).
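A short sketch of how the required and the extra arguments are split. The body, which just returns everything, and the keyword name scale are assumptions added for illustration.

```python
def f(a, b, *args, **kwargs):
    # a and b are required; anything beyond them lands in args or kwargs.
    return a, b, args, kwargs


print(f(1, 2, 3))               # (1, 2, (3,), {})
print(f(1, 2))                  # (1, 2, (), {})
print(f(1, 2, 3, 4, scale=10))  # (1, 2, (3, 4), {'scale': 10})
```

Calling f(1) would raise a TypeError, because b has no default value.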
 
  • #8
WWGD said:
I tried using it to define variance

There are library functions for that.

For values in a pandas DataFrame, there's DataFrame.var.
Python:
import pandas

df = pandas.DataFrame(...)

# Column-wise variance (axis=0 is the default)
df.var()
df.var(axis=0)

# Row-wise variance
df.var(axis=1)

# Variance of a single column
df.loc[:, column].var()

For general iterables, there's numpy's var
Python:
import numpy

numpy.var(a_list)

# Should also work on a pandas.Series object:
numpy.var(df[column])
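One thing worth knowing when comparing the two libraries: pandas' var defaults to ddof=1 (sample variance), while numpy.var defaults to ddof=0 (population variance), so they can disagree on the same data. A minimal sketch with made-up numbers:

```python
import numpy as np
import pandas as pd

data = [1.0, 2.0, 3.0, 4.0]  # illustrative values only
series = pd.Series(data)

sample_var = series.var()         # pandas default: ddof=1, divides by n - 1
population_var = np.var(data)     # numpy default: ddof=0, divides by n
matched_var = np.var(data, ddof=1)  # numpy asked to match the pandas default

print(sample_var, population_var, matched_var)
```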
 
  • #9
Thanks. I was trying to practice by defining it on my own, and I got an error about iterating over floats when defining it as sum[(x_i-xbar)(x_i-xbar) for x_i in list], where xbar is the mean.
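For practice purposes, a hand-rolled version might look like the sketch below. The function name variance and the ddof parameter are assumptions borrowed from numpy's naming. Two details differ from the expression as written above: sum is called with parentheses around a generator expression, and the product needs an explicit *. Whether either of these caused the reported error cannot be told from the thread.

```python
def variance(values, ddof=0):
    """Variance of a non-empty sequence of numbers.

    ddof=0 gives the population variance, ddof=1 the sample variance.
    """
    n = len(values)
    xbar = sum(values) / n
    return sum((x - xbar) * (x - xbar) for x in values) / (n - ddof)


data = [1.0, 2.0, 3.0, 4.0]  # illustrative values only
print(variance(data))
print(variance(data, ddof=1))
```

An error about floats not being iterable typically means that sum (or a for loop) was handed a single float rather than a list, so it is worth checking what is actually passed in.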
 

FAQ: Elementary Python Questions: Data Frames, k-nary functions

1. What is a data frame in Python?

In Python, a data frame usually means a pandas DataFrame: a two-dimensional data structure that is commonly used to store and manipulate tabular data. It is similar to a spreadsheet or a database table, with rows representing observations and columns representing variables.

2. How can I create a data frame in Python?

To create a data frame in Python, you can use the pandas library. First import the library, then call the DataFrame constructor, passing in your data as, for example, a dictionary of columns or a list of rows.
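A minimal sketch of both forms; the column names "x" and "y" and the values are made up for illustration.

```python
import pandas as pd

# From a dictionary: keys become column names, values become the columns.
df_from_dict = pd.DataFrame({"x": [1, 2, 3], "y": [4.0, 5.0, 6.0]})

# From a list of rows, with the column names given separately.
df_from_rows = pd.DataFrame([[1, 4.0], [2, 5.0], [3, 6.0]], columns=["x", "y"])

print(df_from_dict.shape)          # (3, 2)
print(list(df_from_rows.columns))  # ['x', 'y']
```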

3. What is a k-nary function in Python?

A k-nary (more commonly written k-ary) function is a function that takes k arguments, where k stands for a fixed number. For example, a binary function takes two arguments, while a ternary function takes three. A Python function that accepts any number of arguments is usually called variadic and is written with *args.

4. How do I pass multiple arguments to a k-nary function in Python?

To pass multiple arguments to a k-nary function in Python, you can use the asterisk (*) operator to unpack a list or tuple into individual arguments. For example, if you have a ternary function that takes in three arguments, you can pass in a list or tuple with three elements and use the asterisk operator to unpack them.
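A short sketch of the unpacking described above, using a hypothetical ternary function volume. The double asterisk does the same job for a dictionary of named arguments.

```python
def volume(length, width, height):
    """Hypothetical ternary function used only to show unpacking."""
    return length * width * height


dims = [2, 3, 4]
print(volume(*dims))      # same as volume(2, 3, 4)

options = {"length": 2, "width": 3, "height": 4}
print(volume(**options))  # same as volume(length=2, width=3, height=4)
```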

5. What are some common use cases for k-nary functions in Python?

K-nary functions are often used in situations where you need to handle a variable number of inputs or outputs. Some common use cases include mathematical operations that can take in multiple numbers, functions that need to work with different types of data, and functions that need to handle optional arguments.
