I'll add that in non-parametric statistics we can still make specific distributional assumptions, but this is indeed probably less common.
Non-parametric may also refer to the fact that we do not parametrize the structure of our statistical model (which is not always the same as parametrizing the distribution).
In the context of statistical models, non-parametric may also mean that in order to make a prediction with our model we usually need the whole training set (or at least an amount of data of about the same order as the size of the training set). Consider the example of linear regression vs. localized linear regression:
In linear regression, you could take some training set and estimate that y = a*x + b. Note that we have assumed a certain structure on the data, and this structure is described by two parameters. Now if you want to make an out-of-sample prediction (for some given x), all you need to know are the two parameters (a, b). You can send these two numbers to a friend (who doesn't have the original training set), and he can make the same predictions. This is parametric regression.
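To make this concrete, here is a minimal sketch in Python (numpy only); the data and the specific numbers are made up for illustration. The key point is that after fitting, only `a` and `b` are needed to predict:

```python
import numpy as np

# Hypothetical training data: y is roughly 2*x + 1 plus noise.
rng = np.random.default_rng(0)
x_train = rng.uniform(0, 10, size=50)
y_train = 2 * x_train + 1 + rng.normal(scale=0.5, size=50)

# Fit y = a*x + b by ordinary least squares.
a, b = np.polyfit(x_train, y_train, deg=1)

# Out-of-sample prediction: only (a, b) are needed, not the training set.
x_new = 7.3
y_pred = a * x_new + b
print(f"a={a:.3f}, b={b:.3f}, prediction at x={x_new}: {y_pred:.3f}")
```

Once `(a, b)` are printed, the training data can be thrown away; those two numbers are the entire fitted model.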
In localized linear regression, if you want to make an out-of-sample prediction (say for some x'), you take all the points from your training set whose x values are close to x' and do a (perhaps weighted) linear regression using only those points. You get some parameters (a, b) and use those to output your prediction y. Note that (a, b) will depend on x': if you want to make a prediction for a different x', you will have to take different points from your training set, and your (a, b) will likely be different. Unlike the previous example, our assumption about the structure of the data is much weaker: instead of assuming a globally linear relationship, we only assume that the relationship between y and x is locally linear.
If you want your friend to be able to make predictions, you will have to send him the whole training set, not just a couple of parameters. That's why this is non-parametric.
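Here is a sketch of that localized procedure, again in plain numpy; the Gaussian weighting and the bandwidth are just one common way to implement "points close to x'", and the data are made up. Notice that the predictor function needs `x_train` and `y_train` at prediction time, and that `(a, b)` are recomputed for every query point:

```python
import numpy as np

def local_linear_predict(x_train, y_train, x_query, bandwidth=1.0):
    """Fit y = a*x + b using only points near x_query (Gaussian weights),
    then predict at x_query. (a, b) are recomputed for every query."""
    w = np.exp(-0.5 * ((x_train - x_query) / bandwidth) ** 2)
    sw = np.sqrt(w)
    X = np.column_stack([x_train, np.ones_like(x_train)])
    # Weighted least squares via the usual sqrt-weight trick.
    (a, b), *_ = np.linalg.lstsq(X * sw[:, None], y_train * sw, rcond=None)
    return a * x_query + b

# Hypothetical data that is locally linear but globally nonlinear.
rng = np.random.default_rng(0)
x_train = rng.uniform(0, 10, size=200)
y_train = np.sin(x_train) + rng.normal(scale=0.1, size=200)

# Different query points use different local fits (different (a, b)).
print(local_linear_predict(x_train, y_train, 2.0))
print(local_linear_predict(x_train, y_train, 7.5))
```

There is no fixed, finite set of numbers you could extract in advance and hand to your friend: the "model" is effectively the training set itself plus the fitting procedure.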