OpenBLAS crashes on Ubuntu 15.04

  • Thread starter: Pablo Brubeck
  • Tags: matlab, ubuntu

Discussion Overview

The discussion revolves around system crashes occurring on Ubuntu 15.04 when attempting to perform matrix operations (reduction, inversion, factorization) on large matrices (10000x10000) using OpenBLAS and LAPACK libraries in Julia and MATLAB. Participants explore potential causes, including memory limitations and software compatibility.

Discussion Character

  • Technical explanation
  • Debate/contested
  • Exploratory

Main Points Raised

  • One participant reports that their system restarts without error messages when processing large matrices, suspecting issues with linear algebra libraries.
  • Another participant suggests testing with smaller matrices (100x100) to see if the crashes persist; the original poster replies that small matrices (up to about 1000x1000) do not cause crashes.
  • A participant notes that their Windows laptop, despite having lower specs, can handle large matrices without crashing, implying a potential difference in software behavior between operating systems.
  • There is a discussion about the operating system versions, with one participant assuming a 32-bit version of Ubuntu might be in use, but it is later clarified that both systems are 64-bit.
  • A participant describes the issue as a kernel panic, attributing it to potential memory corruption and providing details on how to check memory usage with the 'free' command.
  • The same participant estimates the memory requirements for the matrix operations, suggesting that they may exceed available memory and recommending adding swap space or more RAM as potential solutions.
  • One participant points out that the estimate confused bits with bytes (64 bits is 8 bytes), and the author of the estimate acknowledges the correction.

Areas of Agreement / Disagreement

Participants express differing views on the cause of the crashes, with some attributing it to memory limitations while others suggest software compatibility issues. No consensus is reached on a definitive solution or cause.

Contextual Notes

Participants mention various factors that could influence the behavior of the software, including memory usage, operating system architecture, and the specific libraries in use. The discussion does not resolve the underlying technical issues or assumptions regarding system configurations.

Pablo Brubeck
Whenever I try to reduce, invert, or factorize matrices of size 10000x10000, my whole system suddenly restarts without any error message. This happens in both Julia and MATLAB when I run the command A=rand(10000,10000)^-1;

I suspect the problem is due to the linear algebra libraries (I have openblas 0.2.12-1 and lapack 3.5.0-4). I'm running Ubuntu 15.04 on an Intel Core i7-4790K, 2x8GB RAM Kingston Fury, Asus Z97-P motherboard, and NVIDIA 980 GTX GPU.

Help please, and thanks.
 
It may very well be the size of your matrix - it is slightly less than 1Gbyte in size. Try with a 100x100 matrix first and see if it still crashes.
 
Svein said:
It may very well be the size of your matrix - it is slightly less than 1Gbyte in size. Try with a 100x100 matrix first and see if it still crashes.
It does not crash with small matrices; it works well with 1000x1000. My Windows laptop with lower specs can work with those large sizes.
 
Pablo Brubeck said:
My Windows laptop with lower specs can work with those large sizes.
Let me guess - you are running a 64bit version of Windows on your laptop and 32bit version of Ubuntu...
 
Svein said:
Let me guess - you are running a 64bit version of Windows on your laptop and 32bit version of Ubuntu...
Both are 64-bit OSes running on 64-bit machines. It seems that the problem is not present when using Octave.
 
What you have on Ubuntu is called a kernel panic - usually from corruption of
kernel data often due to a programming error in user space piddling in kernel space.

The Linux command
Code:
free
will show you how much memory is in use at any given time. Normally, a single-user system will have about 90% of memory free. Note that the buffers you see come and go dynamically.

From the man7.org man page for the free(1) command:

free displays the total amount of free and used physical and swap memory in the system, as well as the buffers and caches used by the kernel. The information is gathered by parsing /proc/meminfo. The displayed columns are:

  • total: Total installed memory (MemTotal and SwapTotal in /proc/meminfo)
  • used: Used memory (calculated as total - free - buffers - cache)
  • free: Unused memory (MemFree and SwapFree in /proc/meminfo)
  • shared: Memory used (mostly) by tmpfs (Shmem in /proc/meminfo, available on kernels 2.6.32, displayed as zero if not available)
  • buffers: Memory used by kernel buffers (Buffers in /proc/meminfo)
  • cache: Memory used by the page cache and slabs (Cached and Slab in /proc/meminfo)
  • buff/cache: Sum of buffers and cache
  • available: Estimation of how much memory is available for starting new applications, without swapping. Unlike the data provided by the cache or free fields, this field takes into account page cache and also that not all reclaimable memory slabs will be reclaimed due to items being in use (MemAvailable in /proc/meminfo, available on kernels 3.14, emulated on kernels 2.6.27+, otherwise the same as free)

Now you have a tool.

Your matrix is a lot larger than mentioned before. 64 (bits in a signed integer) * 10000 * 10000 is ~6.4GB, with 80 bits in a double precision floating point variable it is ~8.0GB. Math packages tend to use existing numeric formats unless you are using extended precision - like in GMP.

The free tool will show you the available memory. If you need more memory you will have to add swap space.
Virtual memory = swap (paging) file size plus physical memory. This is a temporary fix.
Adding more RAM is another possibility, but it costs money.
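
As a rough illustration of the check described above, here is a minimal Python sketch that parses /proc/meminfo-style text (the same source the man page says free reads) and compares a planned allocation with the available memory and free swap. The helper names parse_meminfo and fits_in_memory and the SAMPLE values are made up for this example; on a real system you would pass in the contents of /proc/meminfo instead.

```python
# Sketch only: helper names and SAMPLE values are hypothetical.

def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of values (kB entries -> bytes)."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if not sep or not fields:
            continue
        value = int(fields[0])
        # Most entries carry a "kB" unit (1024 bytes); a few are plain counts.
        if len(fields) > 1 and fields[1] == "kB":
            value *= 1024
        info[key.strip()] = value
    return info


def fits_in_memory(required_bytes, info):
    """Return (fits_in_ram, fits_with_swap) for a planned allocation."""
    # Fall back to MemFree on kernels that do not report MemAvailable.
    available = info.get("MemAvailable", info.get("MemFree", 0))
    swap_free = info.get("SwapFree", 0)
    return (required_bytes <= available,
            required_bytes <= available + swap_free)


# Made-up numbers for illustration, not taken from the poster's machine.
SAMPLE = """\
MemTotal:       16384000 kB
MemFree:         1024000 kB
MemAvailable:    2048000 kB
SwapTotal:       4096000 kB
SwapFree:        4096000 kB
HugePages_Total:       0
"""

info = parse_meminfo(SAMPLE)   # real use: parse_meminfo(open("/proc/meminfo").read())
print(fits_in_memory(3_000_000_000, info))
```

With these sample numbers a 3 GB request does not fit in available RAM alone but does fit once free swap is counted, which is the situation where adding swap helps.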

Also consider letting the people who support the problem software packages know
of the issue. This is important in the long run.

Since Octave does not exhibit the problem, use it instead if the above is too
much trouble. But still consider reporting the problem.
 
jim mcnamara said:
64 (bits in a signed integer) * 10000 * 10000 is ~6.4GB, with 80 bits in a double precision floating
point variable it is ~8.0GB.
Umm - 64bits = 8bytes...
 
Thank you - you are absolutely right.
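
Following the correction above (64 bits = 8 bytes), the size estimate can be redone as a short Python sketch. The function name matrix_bytes and the WORKING_COPIES factor are illustrative assumptions; in particular, the number of matrix-sized arrays an inversion holds at once depends on the library and is only guessed here.

```python
# Rough size estimate for a dense matrix of 64-bit (8-byte) doubles.
# matrix_bytes and WORKING_COPIES are illustrative names, not from any library.

def matrix_bytes(rows, cols, bytes_per_element=8):
    """Return the storage needed for a dense rows x cols matrix, in bytes."""
    return rows * cols * bytes_per_element


N = 10000
single = matrix_bytes(N, N)  # 10000 * 10000 * 8 bytes

# Assumption: an inversion keeps a few matrix-sized arrays alive at once
# (input, result, workspace). The factor 3 is a guess, not a measurement.
WORKING_COPIES = 3
working_set = WORKING_COPIES * single

print(f"one matrix : {single / 1e9:.1f} GB ({single / 2**30:.2f} GiB)")
print(f"working set: {working_set / 1e9:.1f} GB (assuming {WORKING_COPIES} copies)")
```

This gives about 0.8 GB for one matrix, in line with the "slightly less than 1Gbyte" figure from earlier in the thread, and about 2.4 GB under the assumed three copies.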
 
