MHB Summing up the divisors raised to power k

  • Thread starter Thread starter Saitama
  • Start date Start date
  • Tags Tags
    Power
AI Thread Summary
The discussion revolves around optimizing a solution for the SPOJ problem IITD4, which is facing time limit exceeded (TLE) issues. The original code attempts to compute a function involving the sum of powers of divisors but is inefficient due to its nested loops and divisor checks. Suggestions for optimization include reducing the number of iterations by focusing on prime factors instead of all divisors, leveraging the properties of the divisor function σ_k(n) to simplify calculations. Implementing a sieve to generate primes and using these primes to calculate contributions to the sum is recommended. The conversation also highlights the need to avoid expensive operations like dynamic array resizing during prime storage. Additionally, reversing the order of loops to reduce complexity and employing memoization for repeated calculations of powers are suggested strategies to improve performance. The discussion concludes with a focus on refining the approach to efficiently compute the required sums while addressing the TLE problem.
Saitama
Messages
4,244
Reaction score
93
Here is the problem I am trying to solve: SPOJ.com - Problem IITD4

Following is what I tried:

Code:
#include <iostream>
#include <cstdio>
#include <cmath>
#define MOD 1000000007
using namespace std;

typedef long long ll;

ll modularPower(ll base, ll exponent)
{
	ll res=1;
	while(exponent)
	{
		if(exponent%2)
			res=(res*base)%MOD;
		exponent/=2;
		base=(base*base)%MOD;
	}
	return res;
}

ll f(ll i, ll k)
{
	ll j, u, sum=0;
	u=sqrt(i);
	for(j=1; j<=u; ++j)
	{
		if(i%j==0)
		{
			sum=(sum+modularPower(j, k));
			sum=(sum+modularPower(i/j, k));
		}
	}
	if(u*u==i)
	{
		sum=(sum-modularPower(u, k)+MOD)%MOD;
	}
	return sum;
}

int main()
{
	int T, a, b, k, i;
	scanf("%d", &T);
	while(T--)
	{
		ll ans=0;
		scanf(" %d %d %d", &a, &b, &k);
		for(i=a; i<=b; ++i)
			ans=(ans+f(i,k))%MOD;
		cout<<ans<<endl;
	}
}

..but this gives TLE. What optimisations should I try? :confused:

Thanks!
 
Technology news on Phys.org
Pranav said:
Here is the problem I am trying to solve: SPOJ.com - Problem IITD4

Following is what I tried:

Code:
#include <iostream>
#include <cstdio>
#include <cmath>
#define MOD 1000000007
using namespace std;

typedef long long ll;

ll modularPower(ll base, ll exponent)
{
	ll res=1;
	while(exponent)
	{
		if(exponent%2)
			res=(res*base)%MOD;
		exponent/=2;
		base=(base*base)%MOD;
	}
	return res;
}

ll f(ll i, ll k)
{
	ll j, u, sum=0;
	u=sqrt(i);
	for(j=1; j<=u; ++j)
	{
		if(i%j==0)
		{
			sum=(sum+modularPower(j, k));
			sum=(sum+modularPower(i/j, k));
		}
	}
	if(u*u==i)
	{
		sum=(sum-modularPower(u, k)+MOD)%MOD;
	}
	return sum;
}

int main()
{
	int T, a, b, k, i;
	scanf("%d", &T);
	while(T--)
	{
		ll ans=0;
		scanf(" %d %d %d", &a, &b, &k);
		for(i=a; i<=b; ++i)
			ans=(ans+f(i,k))%MOD;
		cout<<ans<<endl;
	}
}

..but this gives TLE. What optimisations should I try? :confused:

Thanks!

Hi Pranav,

The number of iterations in the loop in your function [m]f[/m] can be reduced.
Instead of iterating over all divisors, we can limit ourselves to the primes.

Note that $F(n,k)$ is the divisor function $\sigma_k(n)$.
In particular, if we break up $n$ into its primes, the summation turns into a summation of a set of geometric series.
So suppose we have the primes and their powers in $n$:
$$n=\prod_{i=1}^r p_i^{a_i}$$
Then:
$$\sigma_k(n) = F(n,k) = \prod_{i=1}^r \frac {p_i^{(a_i+1)k} - 1 }{p_i^k - 1}$$
 
I like Serena said:
Hi Pranav,

The number of iterations in the loop in your function [m]f[/m] can be reduced.
Instead of iterating over all divisors, we can limit ourselves to the primes.

Note that $F(n,k)$ is the divisor function $\sigma_k(n)$.
In particular, if we break up $n$ into its primes, the summation turns into a summation of a set of geometric series.
So suppose we have the primes and their powers in $n$:
$$n=\prod_{i=1}^r p_i^{a_i}$$
Then:
$$\sigma_k(n) = F(n,k) = \prod_{i=1}^r \frac {p_i^{(a_i+1)k} - 1 }{p_i^k - 1}$$

So I guess I have to implement a sieve and make an array of primes <=10^5. Umm...I wonder if it will pass all the cases but still, I will implement this tomorrow morning. Meanwhile, I found the following solution which does not require implementing sieve. I tried to understand it but I am stuck at the ndiv function used. Any idea what it is used for? :confused:

Code:
#include <iostream>
#include <math.h>
using namespace std;
#define ULL unsigned long long
# define MOD 1000000007;
int ndiv(int n, int m, int p)
{
    return m/p - (n-1)/p;

}

unsigned long long fsum(unsigned long long a, unsigned long long b)
{
    return (a+b)%1000000007;
}
unsigned long long fprod(unsigned long long a,unsigned long long b)
{
    return ((a)*(b))%1000000007;
}
ULL Pow(ULL a, ULL b)
{
    if(a==1)
    {
        return 1;
    }

    ULL x=1,y=a;
    while(b != 0)
    {
        if(b&1)
        {
            x=(x*y);
            if(x>1000000007) x%=1000000007;
        }
        y = (y*y);
        if(y>1000000007) y%=1000000007;
        b /= 2;
    }
    return x;
}
ULL G(ULL a, ULL b, ULL k)
{
    long long z = 0;
    for(int i = 1; i<=b; i++)
    {
        z += fprod(ndiv(a,b,i),Pow(i,k));
    }
    z%=MOD;

    return z;
}
int main()
{
    int T;

    cin>>T;
    ULL a,b,k;
    for(int q = 1; q<=T; q++)
    {
        ((cin>>a)>>b)>>k;
        cout<<G(a,b,k);
        if(q!=T)
        {
            cout<<endl;
        }
    }
    return 0;
}
 
It is as I feared, I am still getting TLE. Following is the code I have written:

Code:
#include <iostream>
#include <vector>
#define MOD 1000000007
using namespace std;

typedef long long ll;

bool isPrime[100010];
vector<int> p;

void sieve() {
    
    for(int i = 0; i <= 100000;++i) {
        isPrime[i] = true;
    }
    isPrime[0] = false;
    isPrime[1] = false;
    for(int i = 2; i * i <= 100000; ++i) {
         if(isPrime[i] == true) {
             // Mark all the multiples of i as composite numbers
             for(int j = i * i; j <= 100000 ;j += i)
                 isPrime[j] = false;
        }
    }
    for(int i=0; i<=100000; ++i)
    	if(isPrime[i])
    		p.push_back(i);
}

ll modularPower(ll base, ll exponent)
{
	ll res=1;
	while(exponent)
	{
		if(exponent%2)
			res=(res*base)%MOD;
		exponent/=2;
		base=(base*base)%MOD;
	}
	return res;
}

ll f(int i, int k)
{
	int j;
	ll product=1, count=0, aux1, aux2;
	for(j=0; j<p.size(); ++j)
	{
		if(p[j]>i)
			break;
		if(!(i%p[j]))
		{
			count=0;
			while(i%p[j]==0)
			{
				i/=p[j];
				count++;
			}
			aux1=modularPower(p[j],(count+1)*k)-1;
			aux2=modularPower(p[j],k)-1;
			aux2=modularPower(aux2, MOD-2);
			product=(((product*aux1)%MOD)*aux2)%MOD;
		}
	}
	return product;
}

int main() {
	sieve();
	int T, a, b, k;
	cin>>T;
	while(T--)
	{
		cin>>a>>b>>k;
		ll ans=0;
		for(int i=a; i<=b; ++i)
			ans=(ans+f(i,k))%MOD;
		cout<<(ans+MOD)%MOD<<endl;
	}
	return 0;
}

Have I implemented your idea incorrectly? Please let me know.

Also, any clue about the ndiv function in the code I posted in my previous post? Thanks! :)
 
Pranav said:
It is as I feared, I am still getting TLE. Following is the code I have written:

Have I implemented your idea incorrectly? Please let me know.

Your code looks correct except for one thing.

The numerator is guaranteed to be divisible by the denominator.
However, I'm afraid that's no longer guaranteed once we take them modulo $m$.
To evaluate $\frac a b \bmod m$, I'm afraid we will have to find $(b \bmod m)^{-1}$ with the Euclidean algorithm. (Sweating)And as an observation [m]p.push_back(i)[/m] is pretty expensive, since it triggers a reallocation for every iteration.
I've just noticed that the time limit is 0.53 seconds and just these push_back's might already cause an timeout.
Also, any clue about the ndiv function in the code I posted in my previous post? Thanks! :)

With a timeout of 0.53 seconds, we'll have to be smarter.
One of the loops has to be collapsed.
So let's see what happens if we reverse them.

Suppose we want to find G(1,6,2).

$$\DeclareMathOperator{\ndiv}{ndiv}
G(1,6,2) = \sum_{n=1}^6 \sigma_2(n) = \sum_{n=1}^6 \sum_{d|n} d^2 \\
= 1^2 \\
+ 1^2 + 2^2 \\
+ 1^2 \phantom{+ 2^2} + 3^2 \\
+ 1^2 + 2^2 \phantom{+ 3^2} + 4^2 \\
+ 1^2 \phantom{ + 2^2} \phantom{ + 3^2} \phantom{ + 4^2} + 5^2 \\
+ 1^2 + 2^2 + 3^2 \phantom{ + 4^3} \phantom{ + 5^2}+ 6^2 \\
= \ndiv(1, 6,1) \cdot 1^2 + \ndiv(1, 6,2)\cdot2^2 + ... + \ndiv(1,6,6)\cdot 6^2
$$
where $\ndiv(a,b,i)$ is the number of times that a number between $a$ and $b$ is divisible by $i$.

More generally:
$$G(a,b,k) = \sum_{n=1}^b \ndiv(a,b,n) \cdot n^k$$

Now how many numbers between $1$ and $b$ are divisible by, say, $2$? (Wondering)
 
I like Serena said:
And as an observation [m]p.push_back(i)[/m] is pretty expensive, since it triggers a reallocation for every iteration.
I've just noticed that the time limit is 0.53 seconds and just these push_back's might already cause an timeout.
I tried replacing the vector with an array but I still get TLE.

With a timeout of 0.53 seconds, we'll have to be smarter.
One of the loops has to be collapsed.
So let's see what happens if we reverse them.

Sorry, I am a little confused here, which loop are you talking about? :confused:

Suppose we want to find G(1,6,2).

$$\DeclareMathOperator{\ndiv}{ndiv}
G(1,6,2) = \sum_{n=1}^6 \sigma_2(n) = \sum_{n=1}^6 \sum_{d|n} d^2 \\
= 1^2 \\
+ 1^2 + 2^2 \\
+ 1^2 \phantom{+ 2^2} + 3^2 \\
+ 1^2 + 2^2 \phantom{+ 3^2} + 4^2 \\
+ 1^2 \phantom{ + 2^2} \phantom{ + 3^2} \phantom{ + 4^2} + 5^2 \\
+ 1^2 + 2^2 + 3^2 \phantom{ + 4^3} \phantom{ + 5^2}+ 6^2 \\
= \ndiv(1, 6,1) \cdot 1^2 + \ndiv(1, 6,2)\cdot2^2 + ... + \ndiv(1,6,6)\cdot 6^2
$$
where $\ndiv(a,b,i)$ is the number of times that a number between $a$ and $b$ is divisible by $i$.

More generally:
$$G(a,b,k) = \sum_{n=1}^b \ndiv(a,b,n) \cdot n^k$$

Now how many numbers between $1$ and $b$ are divisible by, say, $2$? (Wondering)

Thanks a lot, I finally understand it! :)
 
Pranav said:
I tried replacing the vector with an array but I still get TLE.

For the record, you could make another optimization by storing the result in a table whenever you calculate $p_i^k$ and use the table entry if you already have it (so called lazy evaluation).

But I'm pretty sure it won't be enough to get below 0.53 seconds.
Pranav said:
Sorry, I am a little confused here, which loop are you talking about? :confused:

The outer loop is the summation from $a$ to $b$.
Within that loop we have a summation of the k-powers of the divisors (or primes in your latest version).
The most inner loop is to calculate the powers.

I'm talking about the order of the 2 summations - the outer loops.
They live in different functions, but I still consider them nested loops.

Effectively we've reversed the order of those loops and then collapsed the loop that runs from $a$ to $b$.

Thanks a lot, I finally understand it! :)

Good! (Mmm)
 
Back
Top