
RE: Particle physics @ utopian-io - Objects isolation, histogramming and a first task request

in #utopian-io · 6 years ago (edited)

Completely unrelated to the rest, there is no dedicated module allowing one to read a histos.saf file and get plots out of it. I would like to get a Python code (potentially relying on matplotlib) that does this.

This caught my attention. I spent some of the spring semester writing custom R code to create histograms (actually ended up with probability distro curves, but histos were created along the way) from the raw per-cell data from some fluid dynamics experiments. For example, plotting the volumetric distribution of the kinetic energy dissipation rate throughout the reactor, rather than relying on an average value.

edit: Forgot to ask my actual question. Do you know how utopian deals with multiple people implementing answers to task requests?



A few more questions:

  • Are the histogram names within the *.saf file unique (will "ptj1" only show up once)?
  • There's no info on the x-axis label, y-axis label, or title in the file; what defaults would you like?
  • Are the under/overflow bins shown in the histogram?
  • Are you looking for a Python module which can be called from code, or a command line tool?
  • Do you want this to produce an image file or a matplotlib object which can be further edited?

To make sure I understand the data format:

The total value corresponding to each bin is obtained from the sum of the numbers of the two columns.

For the following data:

 <Data>
       0.000000e+00   0.000000e+00    # underflow
       0.000000e+00   1.000000e+00    # bin 1 / 3
       3.000000e-04   0.000000e+00    # bin 2 / 3
       2.000000e-04   3.000000e-04    # bin 3 / 3
       0.000000e+00   0.000000e+00    # overflow
 </Data>

The values of bins 1, 2, and 3 are 1, 3×10⁻⁴, and 5×10⁻⁴, respectively. Further, we don't need to show this split in the histogram. Correct?

Here are answers to your questions. Do not hesitate to come back to me if necessary.

Are the histogram names within the *.saf file unique (will "ptj1" only show up once)?

All histograms are concatenated within a given SAF file.

There's no info on the x-axis label, y-axis label, or title in the file; what defaults would you like?

The title is given between quotes, right after the description. The axis names are not passed. You can leave them blank. However, it may be nice to allow the user to give them.

Are the under/overflow bins shown in the histogram?

It would be nice to leave this option to the user.

Are you looking for a Python module which can be called from code, or a command line tool?

Both can be done quite easily.

Do you want this to produce an image file or a matplotlib object which can be further edited?

Once again, both. The matplotlib object is important, as it allows the user to tune the plot to the layout he/she wants.
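The options discussed above (a user-supplied title and axis names, optional under/overflow bins, and both an image file and an editable matplotlib object) could be sketched roughly as follows. This is only an illustration, not the tool developed in this thread: the function name `plot_histogram`, its parameters, and the idea that the caller supplies the axis range `xmin`/`xmax` are all assumptions, and the numbers in the usage lines are made up.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt


def plot_histogram(values, xmin, xmax, title="", xlabel="", ylabel="",
                   underflow=None, overflow=None, outfile=None):
    """Draw one histogram and return (fig, ax) so the caller can restyle it.

    Hypothetical helper: `values` holds the regular bin contents, and
    `xmin`/`xmax` are assumed to be provided by the caller.
    """
    nbins = len(values)
    if nbins == 0:
        raise ValueError("need at least one bin")
    width = (xmax - xmin) / nbins
    edges = [xmin + i * width for i in range(nbins + 1)]
    heights = list(values)

    # Under/overflow are only drawn when the user passes them in.
    if underflow is not None:
        edges.insert(0, xmin - width)
        heights.insert(0, underflow)
    if overflow is not None:
        edges.append(xmax + width)
        heights.append(overflow)

    fig, ax = plt.subplots()
    ax.bar(edges[:-1], heights, width=width, align="edge")
    ax.set_title(title)
    ax.set_xlabel(xlabel)  # blank by default, as suggested in the reply
    ax.set_ylabel(ylabel)

    if outfile is not None:
        fig.savefig(outfile)  # image file, format taken from the extension
    return fig, ax


# Made-up bin contents, only to exercise the function.
fig, ax = plot_histogram([1.0, 3.0, 5.0], xmin=0.0, xmax=3.0,
                         title="ptj1", underflow=0.5, overflow=0.2)
ax.set_xlabel("pT [GeV]")  # the returned axes can still be tuned afterwards
```

Returning the figure and axes instead of only writing a file is what keeps the plot editable; passing `outfile="ptj1.png"` would additionally write an image.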

The values of bins 1, 2, and 3 are 1, 3×10⁻⁴, and 5×10⁻⁴, respectively. Further, we don't need to show this split in the histogram. Correct?

The total number to put in a bin corresponds to the first column minus the second one.
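As a minimal sketch of this rule, the snippet below reads a block like the one quoted earlier and computes each bin as the first column minus the second. The function name `parse_data_block` is hypothetical, and the sketch assumes that the first row is the underflow and the last row the overflow, as the comments in the quoted data suggest; a real histos.saf file contains more than this block.

```python
def parse_data_block(text):
    """Return (underflow, bins, overflow) from a <Data> ... </Data> block.

    Each value is the first column minus the second column.
    Assumes the first row is the underflow and the last row the overflow.
    """
    values = []
    inside = False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped == "<Data>":
            inside = True
            continue
        if stripped == "</Data>":
            break
        if not inside or not stripped:
            continue
        # Drop the trailing "# ..." comment, then read the two columns.
        first, second = (float(x) for x in stripped.split("#")[0].split())
        values.append(first - second)
    if len(values) < 2:
        raise ValueError("expected at least an underflow and an overflow row")
    return values[0], values[1:-1], values[-1]


sample = """
<Data>
    0.000000e+00   0.000000e+00    # underflow
    0.000000e+00   1.000000e+00    # bin 1 / 3
    3.000000e-04   0.000000e+00    # bin 2 / 3
    2.000000e-04   3.000000e-04    # bin 3 / 3
    0.000000e+00   0.000000e+00    # overflow
</Data>
"""

underflow, bins, overflow = parse_data_block(sample)
print(underflow, bins, overflow)
```

With the sample data from the question, the subtraction gives roughly -1, 3×10⁻⁴ and -1×10⁻⁴ for bins 1 to 3 (up to floating-point rounding), rather than the sums assumed in the question.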

Thanks for the clarifications. I think I understand it well enough to get an implementation coded up.

I will double check the code anyways :)

Hi, sorry if this is a strange question, but I don't understand the resulting bin data values in the SAF file.

I thought that in the histogram the bin values should be whole numbers representing the number of particles matching a particular value range.

Am I wrong about this?

Apologies for a very late answer... Complicated times for me at the moment...

I thought that in the histogram the bin values should be whole numbers representing the number of particles matching a particular value range.

It depends what you want to plot.

If you are interested in the transverse momentum distribution of all jets, then each jet of the event will contribute to the histogram (potentially to different bins). On the other hand, you may be interested in the transverse momentum distribution of the leading (or first) jet of each event. Then, only the first jet matters and will yield an entry in the histogram.

We can also be interested in plotting global variables, like the total activity in the events, which depend not on the actual number of particles but on the sum of their energies.
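The three cases above can be illustrated with a toy sketch. Everything here is made up for illustration: the `fill` helper, the event list, the binning, and the use of the summed jet transverse momentum as a stand-in for the "total activity" are assumptions, not how MadAnalysis 5 fills its histograms.

```python
def fill(histogram, value, xmin, xmax):
    """Add one entry to a list laid out as [underflow, bin 1..n, overflow]."""
    nbins = len(histogram) - 2
    if value < xmin:
        histogram[0] += 1
    elif value >= xmax:
        histogram[-1] += 1
    else:
        index = int((value - xmin) / (xmax - xmin) * nbins)
        histogram[1 + index] += 1


# Hypothetical events: jet transverse momenta, sorted from hardest to softest.
events = [[120.0, 45.0, 30.0], [80.0, 60.0], [250.0]]

nbins = 4
all_jets = [0] * (nbins + 2)  # every jet of every event gives an entry
leading = [0] * (nbins + 2)   # only the first jet of each event gives an entry
total = [0] * (nbins + 2)     # one entry per event for a global quantity

for jets in events:
    for pt in jets:
        fill(all_jets, pt, 0.0, 200.0)
    if jets:
        fill(leading, jets[0], 0.0, 200.0)
    # Stand-in for a global variable: the sum over all jets of the event.
    fill(total, sum(jets), 0.0, 400.0)

print(all_jets)  # [0, 2, 2, 1, 0, 1]
print(leading)   # [0, 0, 1, 1, 0, 1]
print(total)     # [0, 0, 2, 1, 0, 0]
```

The same three events give six entries in the first histogram but only three in the other two, which is the difference between per-object and per-event quantities.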

Does this clarify things?

Thanks, yes I think that clarifies it.
I will submit corrections shortly, probably today.
Cheers!

Cool! Good luck with this!

I've got some good news. This is still janky as heck, but I've got a python tool which parses the saf file, converts it into a pandas dataframe, and allows me to spit out some histograms.
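For readers wondering what such a dataframe might look like, here is one possible layout, with one row per bin. It is a guess for illustration only and not the code described in this comment: the function name `histograms_to_frame`, the column names, the histogram name "ptj2", and the bin contents are all hypothetical.

```python
import pandas as pd


def histograms_to_frame(histograms):
    """Build a long-format DataFrame with one row per bin.

    `histograms` maps a histogram name to a list laid out as
    [underflow, bin 1..n, overflow] (an assumed layout).
    """
    rows = []
    for name, contents in histograms.items():
        last = len(contents) - 1
        for bin_index, value in enumerate(contents):
            if bin_index == 0:
                role = "underflow"
            elif bin_index == last:
                role = "overflow"
            else:
                role = "bin"
            rows.append({"histogram": name, "bin_index": bin_index,
                         "role": role, "value": value})
    return pd.DataFrame(rows, columns=["histogram", "bin_index", "role", "value"])


# Made-up contents for two histograms.
frame = histograms_to_frame({
    "ptj1": [0.0, 1.0, 3.0, 5.0, 0.0],
    "ptj2": [0.0, 2.0, 4.0, 0.5],
})

# Keep only the regular bins and sum them per histogram.
regular = frame[frame["role"] == "bin"]
totals = regular.groupby("histogram")["value"].sum()
print(totals)
```

A long format like this keeps histograms with different numbers of bins in one table, and makes it easy to include or exclude the under/overflow rows before plotting.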

The janky part is that the histograms are still essentially a proof-of-pipeline; matplotlib has turned out to be very different (not bad, just different) from the plotting language (ggplot2) I am used to.

[Image: first_histo.png]

This image will get better between now and Friday. My plan is to push my local repo to a fork of ma5 on my github account, then send you a pull request on Friday. Based on how that goes and your feedback, I will then write up a post.

The downside is that I probably won't have time for 1c, but I can deal with that. It's been very informative using pandas and matplotlib on non-toy data.

Super! This looks very nice. Just two comments:

  • Would it be possible to have different figures for each histogram?
  • What do you mean by a fork to MA5? Can't your code be used externally?

I can give one more week for 1c, as I still haven't found the time to start writing 1d ^^

I can certainly make it multiple figures; I was actually pondering that over morning coffee.

The code exists as a standalone thing, but utopian would like your github ma5 to be the main repo, so I am developing it in a new directory under tools. This also helps with the criterion that a task is accepted by an author by incorporating a pull request.

I am not sure I understand the utopian request. This add-on was supposed to stay independent of madanalysis, as it is only useful once the working directory has been created. I will check with utopian.

One thing that's been consistently an issue is that the utopian workflow is based on largely monolithic projects. Frameworks which generate code, or tiny tools, complicate the idea of what the main repo should be.

If you want a standalone tool, it might be easier to write this as an idea post, and have me write it up as a developer, rather than as a task request on an existing project.


This caught my attention. I spent some of the spring semester writing custom R code to create histograms (actually ended up with probability distro curves, but histos were created along the way) from the raw per-cell data from some fluid dynamics experiments. For example, plotting the volumetric distribution of the kinetic energy dissipation rate throughout the reactor, rather than relying on an average value.

Actually, I insist on the Python option. The reason is that the code is meant to be used by physicists, and physicists do not use R in general (Python, C++ and Fortran are the most widespread programming languages).

edit: Forgot to ask my actual question. Do you know how utopian deals with multiple people implementing answers to task requests?

I have no idea. I suppose all will be rewarded if they are nicely done. On my side, everything that works well will be advertised on the MadAnalysis 5 website. The best module may even be merged with the main branch of madanalysis 5, provided the author agrees.

No worries on the Python requirement, happy to code in that environment.
