Re: [PPL-devel] integer versus rational solutions

On Mon, 2009-07-13 at 12:07 +0200, Albert Cohen wrote:
Tobias Grosser wrote:
On Fri, 2009-07-10 at 23:57 +0200, Albert Cohen wrote:
Hi Tobias.
Tobias Grosser wrote:
Stop. Do I understand this correctly: you do not want to check dependencies after strip mining? I am very concerned about that, as I thought one of the benefits of working on the polytope model is that we can schedule transformations in an arbitrary order. So it is required to reanalyze strip-mined loops. This is also necessary for the dependency checking we use for autopar.
As presented in Austin in November, it is much preferable to follow the transformation composition approach of Girbal and Vasilache that does not require intermediate steps of dependence analysis and legality checking.
You are referring to "Facilitating the search for compositions of program transformations"?
Yes. The 2d+1 encoding and the additional invariants/rules allow us to compose and generalize classical transformations to a polyhedral representation that may have diverged arbitrarily far from the original syntax tree.
Sure. The same with the polyhedral representation in Graphite. It also should allow us to compose and generalize arbitrary transformations on the polyhedral data structures.
However I do not remember that we specified any 2d+1 encoding in our data structures. So at the moment we are more general, and as you might remember from early discussions it is very important for me to keep this until we have a discussion where we agree to trade off generality for performance. Ideally we would have an example, a compile time problem, and a discussion where we agree that adding an assumption like 2d+1 will improve algorithmic complexity and that this improvement is required in real-world code. This information should then be part of graphite-poly.h.
(...)
Konrad is working on Step 2.1 (enhancing strip-mine to update the dependence info). Then, if you agree, he will help restructure the API such that dependence computation and dependence checking are ALWAYS called explicitly, rather than implicitly in the transformation primitives.
Sure. I would like us to even be able to call them after any single transformation.
A loop like the following would be nice for the optimizations:
while (select_and_apply_transform (scop)) { check_dependencies(); }
Yes. Skipping some checks requires stronger technology to guarantee that transformation steps will be legal or will eventually be legal/checked, or even corrected/completed via enabling transformations (see Vasilache's PACT'07 paper).
As a concrete, more original example: together with Razya and Konrad, we are thinking about enabling parloops and interchange with an automatic privatization "corrective step". We would ignore memory-based dependences when looking for interchange combinations that improve both parloops and autovect (using an extended cost model), then check which ones were eventually violated in the selected transformation sequence, and eventually privatize only those scalars/arrays that need to be privatized to effectively remove those memory-based dependences.
That sounds interesting. We also thought about memory optimizations, so I would like the poly_drs to be not read-only, but read/write. This means that we can analyze the data accesses in the polyhedral model and change them there, and Graphite will take care of creating the required GIMPLE code.
This should allow a lot of nice optimizations, like reducing the memory footprint if only a part of an array is written to, or allowing more parallelism by trading memory for fewer dependencies.
I know this will be expensive and we want to disable this in the production compiler. But I want us to be able to use this loop for debugging.
Sure.
I totally agree that it is useful to avoid polyhedral computations, as they tend to be expensive. However, I still want to be able to do CHECK and ANALYSIS at every single step, as it will show us the assumptions that we took to optimize compile time. If we make any assumptions (which might be necessary), I would like them not to be hidden by avoiding dependency ANALYSIS; instead I want our (nightly) tests to fail.
This will allow an analysis/discussion of the assumptions we took. Maybe we can avoid them; if not, we know we have to document them. Every assumption has to be written down at a well-known place so that future developers will not be surprised.
OK. But running the ANALYSIS itself after transformation makes little sense to me, even in debugging mode. Can you explain why?
Sure. The idea I have of Graphite is that we have a set of transformations that all analyze GPOLY and try to improve it. However, improvement does not go in only one direction. For example, there might be passes trading dependencies for memory usage: they improve memory usage but add dependencies, or remove dependencies at the cost of more memory usage.
I would like to have a clever scheduler (or some database) deciding which passes should be applied. E.g. first we trade some memory for fewer dependencies, then we try to expose more parallelism, maybe do some interchange to enable fusion and fission, and finally parallelize all the stuff.
To allow this free ordering we have to ensure that every pass is able to analyze GPOLY to see if it can apply, e.g., interchange. So, e.g., after strip mining it might still be useful to do some more transformations to allow vectorization. If the vectorization pass cannot analyze the GPOLY after strip mining is applied, this will not work.
So what I want is:
1. We have one representation, GPOLY, on which we do our optimizations.
2. Every optimization pass works only on GPOLY.
2.1 Every pass has to be able to read GPOLY and to generate GPOLY.
2.2 Every pass has to decide only by analyzing GPOLY whether it can be applied.
3. Optimizing a program is applying a lot of PASSES on GPOLY to generate better scattering and/or better data accesses.
4. Graphite will read GPOLY and, following the description in GPOLY, generate the GIMPLE code.
It might be possible that this will not work or is too slow. If there is a limitation in GPOLY, the solution I see is to extend GPOLY officially (e.g. to handle arbitrary conditions, allow changing the data accesses, ...).
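To make points 1-4 above concrete, here is a minimal sketch of the intended control flow. Everything in it is hypothetical: gpoly, gpoly_pass, can_apply, apply and optimize_scop do not exist in Graphite; the sketch only illustrates the idea that passes read and rewrite GPOLY, a driver chooses the order, and GIMPLE is regenerated once at the end.

  // Hypothetical sketch only: none of these types or functions exist in
  // Graphite.  It illustrates points 1-4 above: passes read and rewrite
  // GPOLY, a driver/scheduler chooses the order, and code generation back
  // to GIMPLE happens once, after all passes have run.

  #include <vector>

  struct gpoly;   // placeholder for the polyhedral representation

  struct gpoly_pass {
    virtual ~gpoly_pass() {}
    // 2.2: decide applicability only by analyzing GPOLY.
    virtual bool can_apply(const gpoly &g) const = 0;
    // 2.1: read GPOLY and produce GPOLY again.
    virtual void apply(gpoly &g) const = 0;
  };

  // 3: optimizing = applying a sequence of passes chosen by some scheduler.
  void optimize_scop(gpoly &g, const std::vector<gpoly_pass *> &passes) {
    bool changed = true;
    while (changed) {
      changed = false;
      for (gpoly_pass *p : passes)
        if (p->can_apply(g)) {
          p->apply(g);
          // In a debug build, dependence analysis could be re-run here,
          // as in the check_dependencies() loop discussed earlier.
          changed = true;
        }
    }
    // 4: GIMPLE is generated from GPOLY once, after this loop.
  }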
Another thing I do not understand is that you are often referring to "large constants". I thought complexity problems mainly appear because of a high number of dimensions, but not because of high values in an (in)equality. Can you explain this to me?
As mentioned several times, big constants tend to break the "structure" of ILP problems or other polyhedral operations, leading to combinatorial explosions. This is well known even in the case of simple algorithms like Fourier Motzkin. They also increase the risk of memory overflow.
OK. I definitely believe you, but as this seems to become one of the important points for Graphite performance, I would like to understand it well. Maybe we can also write a Wiki page to explain which constructs should be avoided. We already learned that NNC_Polyhedron is bad, but what else is dangerous?
Here I am still not getting the problem. I looked again into Fourier-Motzkin and at first glance I could not find anything in its complexity that depends on the values appearing in the (in)equalities. Any pointer or example that might help me understand this problem would be really helpful.
FM blows up when it generates too many inequalities. To avoid this, people generally favor eliminations that generate 0 or 1 inequality first.
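To make the blow-up concrete, here is a small standalone sketch (not taken from the PPL, CLooG or Graphite) of one Fourier-Motzkin elimination step on integer inequalities a.x <= b: a variable with p lower and q upper bounds produces p*q new inequalities, which is why implementations prefer to eliminate variables for which p*q is 0 or 1 first.

  // Illustrative only: one Fourier-Motzkin elimination step on a system of
  // inequalities  a[0]*x0 + ... + a[n-1]*x(n-1) <= b  with integer data.
  // Eliminating variable k pairs every lower bound (a[k] < 0) with every
  // upper bound (a[k] > 0), so p lower and q upper bounds yield p*q results.

  #include <cstddef>
  #include <vector>

  struct ineq {
    std::vector<long> a;  // coefficients
    long b;               // constant term:  a . x <= b
  };

  std::vector<ineq> eliminate(const std::vector<ineq> &sys, std::size_t k) {
    std::vector<ineq> lower, upper, rest, out;
    for (const ineq &c : sys)
      (c.a[k] < 0 ? lower : c.a[k] > 0 ? upper : rest).push_back(c);

    out = rest;                        // inequalities without x_k survive as-is
    for (const ineq &lo : lower)       // p lower bounds ...
      for (const ineq &up : upper) {   // ... times q upper bounds = p*q results
        long s = -lo.a[k], t = up.a[k];  // both positive scaling factors
        ineq c;
        c.a.resize(lo.a.size());
        for (std::size_t j = 0; j < lo.a.size(); ++j)
          c.a[j] = t * lo.a[j] + s * up.a[j];  // coefficient of x_k cancels
        c.b = t * lo.b + s * up.b;   // note: products of the original constants
        out.push_back(c);
      }
    return out;
  }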
Sure. I understand that the number of dimensions and/or (in)equalities is a limitation. What I do not understand is why the values in an (in)equality are problematic, i.e. why
1 > i might be easier to handle than 99999999 > i.
Or did I get you wrong and you did not say anything about this?
Eventually, the bigger the constant the higher the loss of information when relaxing to rational polyhedra.
OK. However, if we did not add the additional restrictions, the information would never have been available. Or are we losing other information because of adding these restrictions?
Sorry, I missed that part. Can you re-explain?
I thought the mail I was referring to was about putting restrictions on parameters in the context of CLooG. As we know that a parameter "k" is of type unsigned integer, we know that the constraint 0 <= k <= MAX_INT is always true. By adding this constraint to the context we allow CLooG to remove some conditions, as the generated code can assume that k will be within these limits.
If I got your comments right, you warned that adding large constants like MAX_INT might be bad for compile time performance. That is something I did not understand and as adding the constraint to the context improves the code generated by CLooG, I would like to keep adding these constraints as long as compile time allows.
All of this is very informal/intuitive of course. There are some/few theoretical results about this, but Roberto and other experts are better placed than myself to answer it!
@Roberto: Can you give us some hints about what is really expensive and what we should avoid when constructing polyhedra in the PPL?
Tobi

Hello,
On Mon, Jul 13, 2009 at 1:11 PM, Tobias Grosser <grosser@fim.uni-passau.de> wrote:
If I got your comments right, you warned that adding large constants like MAX_INT might be bad for compile time performance. That is something I did not understand and as adding the constraint to the context improves the code generated by CLooG, I would like to keep adding these constraints as long as compile time allows.
From my experience, large constants can influence the complexity of many algorithms (e.g. Fourier-Motzkin) in the polytope model, because of the simple fact that at some point the algorithm has to compute the lowest common denominator (LCD) of entries in matrices. Now, if you combine that with the fact that sometimes you have expressions like "constant-1" or "constant+1", those LCDs can get extremely big, which makes the usage of GMP almost unavoidable and slows down computation a lot. It might also increase memory usage.
Also, larger constants create more cases where inequalities are not redundant, whereas small constants like 0, 1, 2 often create cases that can be simply ignored because they are already expressed by other inequalities.
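A tiny standalone illustration of the "constant-1"/"constant+1" point (plain GMP, nothing LooPo-specific): the two values are almost coprime, so the common multiple computed during elimination has roughly twice as many bits as the original constant.

  // Standalone illustration: bounds like c-1 and c+1 are (almost) coprime,
  // so common multiples/denominators computed while eliminating variables
  // have about twice the bit length of the original constant c.

  #include <gmpxx.h>
  #include <cstdio>

  int main() {
    mpz_class c("4294967296");            // e.g. 2^32, a "large constant"
    mpz_class a = c - 1, b = c + 1, l;
    mpz_lcm(l.get_mpz_t(), a.get_mpz_t(), b.get_mpz_t());
    std::printf("bits(c) = %zu, bits(lcm(c-1, c+1)) = %zu\n",
                mpz_sizeinbase(c.get_mpz_t(), 2),
                mpz_sizeinbase(l.get_mpz_t(), 2));
    // Here lcm(c-1, c+1) = 2^64 - 1: 33 bits have become 64 bits.
    return 0;
  }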
I hope I haven't missed the point here, but this is just how I have experienced the problem so far when working with FM and other algorithms on polytopes containing large constants (as they often result from using our tiling technique in LooPo).
If anyone has any approach to how to deal with this specific problem, I would be very interested to hear about it!
greetings, Michael

Michael Claßen wrote:
On Mon, Jul 13, 2009 at 1:11 PM, Tobias Grosser <grosser@fim.uni-passau.de> wrote:
If I got your comments right, you warned that adding large constants like MAX_INT might be bad for compile time performance. That is something I did not understand and as adding the constraint to the context improves the code generated by CLooG, I would like to keep adding these constraints as long as compile time allows.
From my experience, large constants can influence the complexity of many algorithms (e.g. Fourier-Motzkin) in the polytope model, because of the simple fact that at some point the algorithm has to compute the lowest common denominator (LCD) of entries in matrices. Now, if you combine that with the fact that sometimes you have expressions like "constant-1" or "constant+1", those LCDs can get extremely big, which makes the usage of GMP almost unavoidable and slows down computation a lot. It might also increase memory usage.
Also, larger constants create more cases where inequalities are not redundant, whereas small constants like 0, 1, 2 often create cases that can be simply ignored because they are already expressed by other inequalities.
I hope I haven't missed the point here, but this is just how I have experienced the problem so far when working with FM and other algorithms on polytopes containing large constants (as they often result from using our tiling technique in LooPo).
If anyone has any approach to how to deal with this specific problem, I would be very interested to hear about it!
Is approximation acceptable in your application? There are techniques to compute an upward approximation of a given polyhedron (i.e., a polyhedron containing the given one) so as to limit the number of bits of the coefficients.
The idea, due to Goran Frehse, is the following: for each constraint do:
0) start, e.g., with
109*x + 121*y <= 100;
1) truncate homogeneous coefficients to, e.g., 3 bits, obtaining
6*x + 6*y <= ?;
2) solve an LP problem to push out the plane, obtaining:
6*x + 6*y <= 600/109;
3) approximate the inhomogeneous coefficient to next integer, obtaining
6*x + 6*y <= 6.
So the cost is one LP problem per constraint. Whether it is worthwhile or not depends on the application.
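For what it's worth, here is a rough sketch of the push-out step on this example, using the PPL's maximize(). One assumption is added that is not stated above: x >= 0 and y >= 0 are included so that the LP is bounded and indeed yields 600/109.

  // Rough sketch of step 2 on the example above, using the PPL.
  // Assumption not in the original example: x >= 0 and y >= 0 are added,
  // otherwise maximizing 6*x + 6*y over a single half-plane is unbounded.

  #include <ppl.hh>
  #include <iostream>

  using namespace Parma_Polyhedra_Library;

  int main() {
    Variable x(0), y(1);
    C_Polyhedron ph(2);
    ph.add_constraint(109*x + 121*y <= 100);   // the original constraint
    ph.add_constraint(x >= 0);
    ph.add_constraint(y >= 0);

    // Step 1 gave the truncated coefficients 6 and 6; push the plane out by
    // maximizing 6*x + 6*y over the original polytope (one LP per constraint).
    Coefficient num, den;
    bool maximum;
    if (ph.maximize(6*x + 6*y, num, den, maximum)) {
      // Step 3: round the rational bound num/den up to the next integer
      // (num >= 0 is assumed in this simplified rounding).
      Coefficient bound = (num + den - 1) / den;
      std::cout << "6*x + 6*y <= " << num << "/" << den
                << "  ->  6*x + 6*y <= " << bound << std::endl;
    }
    return 0;
  }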
With Goran we have a plan to implement this and other techniques in the PPL. If this may be useful to you, we may increase the priority of the job. Cheers,
Roberto

On Mon, Jul 13, 2009 at 3:27 PM, Roberto Bagnara <bagnara@cs.unipr.it> wrote:
Michael Claßen wrote:
On Mon, Jul 13, 2009 at 1:11 PM, Tobias Grosser <grosser@fim.uni-passau.de> wrote:
If I got your comments right, you warned that adding large constants like MAX_INT might be bad for compile time performance. That is something I did not understand and as adding the constraint to the context improves the code generated by CLooG, I would like to keep adding these constraints as long as compile time allows.
From my experience, large constants can influence the complexity of many algorithms (e.g. Fourier-Motzkin) in the polytope model, because of the simple fact that at some point the algorithm has to compute the lowest common denominator (LCD) of entries in matrices. Now, if you combine that with the fact that sometimes you have expressions like "constant-1" or "constant+1", those LCDs can get extremely big, which makes the usage of GMP almost unavoidable and slows down computation a lot. It might also increase memory usage.
Also, larger constants create more cases where inequalities are not redundant, whereas small constants like 0, 1, 2 often create cases that can be simply ignored because they are already expressed by other inequalities.
I hope I haven't missed the point here, but this is just how I have experienced the problem so far when working with FM and other algorithms on polytopes containing large constants (as they often result from using our tiling technique in LooPo).
If anyone has any approach to how to deal with this specific problem, I would be very interested to hear about it!
Is approximation acceptable in your application? There are techniques to compute an upward approximation of a given polyhedron (i.e., a polyhedron containing the given one) so as to limit the number of bits of the coefficients.
The idea, due to Goran Frehse, is the following: for each constraint do:
- start, e.g., with 109*x + 121*y <= 100;
- truncate homogeneous coefficients to, e.g., 3 bits, obtaining 6*x + 6*y <= ?;
- solve an LP problem to push out the plane, obtaining 6*x + 6*y <= 600/109;
- approximate the inhomogeneous coefficient to next integer, obtaining 6*x + 6*y <= 6.
So the cost is one LP problem per constraint. Whether it is worthwhile or not depends on the application.
With Goran we have a plan to implement this and other techniques in the PPL. If this may be useful to you, we may increase the priority of the job. Cheers,
Roberto
Hello Roberto,
thanks for the clarification. But unfortunately, I need a precise solution, not an approximation. I use the exact number of points in those sets to calculate the position in a buffer.
It might be possible to use a less precise heuristic, but the heuristic would have to satisfy some properties:
1) It should be deterministic; I always have to know where an element is placed inside the buffer.
2) If there are two polyhedra of different size, the heuristic should yield approximations that also differ in size for the two polyhedra. This is necessary because I count the number of elements in the resulting polyhedron in order to compute a buffer location, and this buffer access function has to be injective (but not necessarily bijective!).
I think 1) should be relatively easy to satisfy. Unfortunately, I think 2) is probably impossible to guarantee using the above heuristic.
I have to say that it puzzles me that there seems to be so little demand for a precise way of dealing with something like Z-polytopes. I have met a number of people already who are looking for the same thing as I am, but apparently, they are either not complaining or just not aware of this discussion.
Thanks for the efforts anyway, I really appreciate this discussion!
regards, Michael

Michael Claßen wrote:
I have to say that it puzzles me that there seems to be so little demand for a precise way of dealing with something like Z-polytopes. I have met a number of people already who are looking for the same thing as I am, but apparently, they are either not complaining or just not aware of this discussion.
Hi Michael,
complaining would not be very useful :-)
I propose that we start from the beginning. But this time, let us avoid anything that is not precisely defined.
My understanding is that you are interested in Z-polytopes. A polytope is a bounded, convex polyhedron, right?
And a Z-polytope is the intersection between a polytope and an integer lattice, right? (Note: in the PPL "integer lattices" are called "integral grids".)
The PPL provides a way to "combine" a Grid (which can represent any integer lattice), with a C_Polyhedron (which can represent any polytope). But the combination is loose, because the tight integration is computationally intractable.
Now, what operations do you require? I guess you need to know if a Z-polytope is empty, right? This requires answering, for a given polytope P in R^n described by a system of constraints with rational coefficients, the question:
Is P intersected with Z^n empty?
We currently use a standard, branch-and-bound mixed integer programming optimizer: it may fail to terminate, it may be ridiculously expensive. Can we do better? We got in touch with three leading experts in the field. Two did not bother to reply. *The* leading expert told us that "[Lenstra's algorithm, the algorithm of Grötschel, Lovász and Schrijver, and the algorithm of Lovász and Scarf] are essentially theoretical. I don't know of any implementations." If you know the algorithm we should use, please share.
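For reference, the emptiness question for P intersected with Z^n can be asked through the PPL's MIP_Problem interface roughly as follows; this is only a sketch against the interface as of PPL 0.10, and it inherits the caveats above about cost and termination.

  // Sketch: is P /\ Z^n empty?  Uses the PPL's branch-and-bound MIP solver,
  // with the caveats mentioned above (it can be very expensive).

  #include <ppl.hh>
  #include <iostream>

  using namespace Parma_Polyhedra_Library;

  int main() {
    const dimension_type n = 2;
    Variable x(0), y(1);

    Constraint_System cs;            // P: a rational polytope ...
    cs.insert(2*x + 2*y >= 1);       // ... that contains rational points
    cs.insert(2*x + 2*y <= 3);       // but no integer points
    cs.insert(x - y == 0);
    cs.insert(x >= 0);
    cs.insert(x <= 1);

    MIP_Problem mip(n, cs, Linear_Expression(), MAXIMIZATION);

    Variables_Set ivars;             // require every dimension to be integral
    for (dimension_type i = 0; i < n; ++i)
      ivars.insert(Variable(i));
    mip.add_to_integer_space_dimensions(ivars);

    std::cout << (mip.is_satisfiable() ? "P /\\ Z^n is non-empty"
                                       : "P /\\ Z^n is empty")
              << std::endl;          // prints "empty" for this P
    return 0;
  }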
If we want the tight integration between C_Polyhedron and Grid, we also need to compute a representation of the "integer hull" of P, that is, the convex polyhedral hull of P intersected with Z^n. Which cutting plane algorithm(s) should we use? So far, we got no answer from the experts we contacted. Do you know the answer?
Perhaps I have misunderstood what you mean by a "precise way of dealing with something like Z-polytopes"? Please let us know. Cheers,
Roberto

Hello Roberto,
I hope I can clear up some of the issues in this mail. I apologize in advance if there is still some imprecision in my explanations.
On Mon, Jul 13, 2009 at 5:26 PM, Roberto Bagnara <bagnara@cs.unipr.it> wrote:
Hi Michael,
complaining would not be very useful :-)
Indeed, I have nothing to complain about, I am very thankful for your fast responses and help. :-)
I propose that we start from the beginning. But this time, let us avoid anything that is not precisely defined.
Yes, agreed.
My understanding is that you are interested in Z-polytopes. A polytope is a bounded, convex polyhedron, right? And a Z-polytope is the intersection between a polytope and an integer lattice, right? (Note: in the PPL "integer lattices" are called "integral grids".)
Yes, agreed.
The PPL provides a way to "combine" a Grid (which can represent any integer lattice), with a C_Polyhedron (which can represent any polytope). But the combination is loose, because the tight integration is computationally intractable.
Ok, I discussed this topic with my colleague Armin Größlinger. It seems that what you mean by "tight integration" is the problem of finding the minimal containing polytope that contains all the points on the integer lattice.
However, we do NOT need this minimal integer hull of our polytope. For our purposes, it is enough to get any description of the set of integer points contained in the intersection of the polytope and our grid/lattice. So the polytope containing these integer points may be slightly too large; it does not have to be minimal. I think this makes finding a representation much easier.
Now, what operations do you require?
We require a test for emptiness, intersection, union, and a difference operator.
1) The test for empty Z-polytopes should be fairly simple using PIPlib (see below).
2) Intersection should also be relatively easy to implement by performing the intersection on the polytope and on the grid.
3) Union should also be easy to represent in the form of a list of Z-polytopes (if a disjoint union is needed, we need a difference operator -> 4).
4) A difference operator seems to be the most complicated operation to implement (a sketch follows below). It should be possible to compute the complement of a given polytope and the corresponding grid in the form of a union of polytopes and a finite set (or list) of complement grids. This representation could get extremely large, but it should be manageable if the result is again intersected with our original polytope and grid. So we compute: A \ B = A 'intersect' (complement B).
Armin told me that he started implementing these functions in our HSLooPo framework already, but there seems to be some problem related to handling parameters in a difference operator.
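Here is a rough sketch of the difference operator from point 4 above, under two simplifying assumptions: the lattice is plain Z^n (the grid part is ignored) and only inequality constraints of B are handled; an equality of B would contribute two pieces instead of one. The resulting pieces may overlap, so a disjoint union would still need the extra work mentioned above.

  // Sketch of A \ B = A /\ complement(B) over Z^n, grids ignored.  For every
  // inequality  e >= 0  of B, its integer complement is  e <= -1,  so A \ B
  // is the union of the (possibly overlapping) polyhedra  A /\ { e <= -1 }.
  // Equalities of B would each contribute two pieces (e <= -1 and e >= 1).

  #include <ppl.hh>
  #include <vector>

  using namespace Parma_Polyhedra_Library;

  std::vector<C_Polyhedron>
  z_difference(const C_Polyhedron &A, const C_Polyhedron &B) {
    std::vector<C_Polyhedron> pieces;
    const Constraint_System &cs = B.constraints();
    for (Constraint_System::const_iterator i = cs.begin(); i != cs.end(); ++i) {
      const Constraint &c = *i;              // stored by the PPL as  e >= 0
      if (!c.is_inequality())
        continue;                            // equalities: see comment above
      // Rebuild the affine form e of the constraint.
      Linear_Expression e(c.inhomogeneous_term());
      for (dimension_type j = c.space_dimension(); j-- > 0; )
        e += c.coefficient(Variable(j)) * Variable(j);
      C_Polyhedron piece(A);
      piece.add_constraint(e <= -1);         // integer negation of  e >= 0
      if (!piece.is_empty())
        pieces.push_back(piece);
    }
    return pieces;
  }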
I guess you need to know if a Z-polytope is empty, right? This requires answering, for a given polytope P in R^n described by a system of constraints with rational coefficients, the question:
Is P intersected with Z^n empty?
yes.
We currently use a standard, branch-and-bound mixed integer programming optimizer: it may fail to terminate, it may be ridiculously expensive. Can we do better? We got in touch with three leading experts in the field. Two did not bother to reply. *The* leading expert told us that "[Lenstra's algorithm, the algorithm of Grötschel, Lovász and Schrijver, and the algorithm of Lovász and Scarf] are essentially theoretical. I don't know of any implementations." If you know the algorithm we should use, please share.
I discussed this problem with Armin. We think that it is possible to use PIPlib for testing for the empty set in this case. The algorithm in PIP is of exponential complexity, if I remember correctly, but at least it is guaranteed to terminate, and in practice I think it is quite feasible. As far as I know, there are already attempts to integrate PIP into the PPL, so this does not seem to be a big problem.
If we want the tight integration between C_Polyhedron and Grid, we also need to compute a representation of the "integer hull" of P, that is, the convex polyhedral hull of P intersected with Z^n. Which cutting plane algorithm(s) should we use? So far, we got no answer from the experts we contacted. Do you know the answer?
I think this can be answered by the two points I mentioned above:
1) we don't need a "minimal" representation of the polytope holding our integer points, only a representation that holds all points;
2) we can use PIP to test for empty sets.
Perhaps I have misunderstood what you mean by a "precise way of dealing with something like Z-polytopes"? Please let us know.
What I mean is: we need some description that can give us the exact set of all integral points contained in the Z-polytope. But as I described above, this representation does not have to be minimal.
Cheers,
Roberto
I hope I could clear up this misunderstanding, but I am sure we still have some open points to discuss. Will you be participating in one of those weekly telephone conferences of the Graphite project?
best regards, Michael

Hello Roberto,
I just wanted to ask if there are any new developments in this issue of integral polytopes? My colleague Armin Groesslinger will visit me next week and it would be great to know the current status for any discussions with him. :-)
best regards, Michael

Michael Classen wrote:
I just wanted to ask if there are any new developments in this issue of integral polytopes? My colleague Armin Groesslinger will visit me next week and it would be great to know the current status for any discussions with him. :-)
Hi Michael,
sorry for the delay: I am pretty busy with other things at the moment. The only new development I can offer right now is that work is about to start on incorporating a Parametric Integer Programming solver into the PPL (we hope this will be more efficient than piplib). All this is meant to happen before the end of September. As far as I understand, with PIP and the code you and Armin Groesslinger are writing, you should obtain what you need. Later we may want to "dress" all this into an Integral_Polytope PPL class, but this is secondary IMHO. All the best,
Roberto

Tobias Grosser wrote:
Sure. I understand that the number of dimensions and/or (in)equalities is a limitation. What I do not understand is why the values in an (in)equality are problematic, i.e. why
1 > i might be easier to handle than 99999999 > i.
Concerning the PPL, the operations performed on coefficients are mostly additions/subtractions, multiplications, GCDs and exact divisions. Addition/subtraction is linear in the number of bits, multiplication and division are (roughly) quadratic, GCD is sub-quadratic since GMP 4.3.0.
Nothing prevents an application of the PPL from building constraints with coefficients with millions of digits. Personally, I never saw that phenomenon but, as I say, this depends on the application. Cheers,
Roberto

Roberto Bagnara wrote:
Tobias Grosser wrote:
Sure. I understand that the number of dimensions and/or (in)equalities is a limitation. What I do not understand is why the values in an (in)equality are problematic, i.e. why 1 > i might be easier to handle than 99999999 > i.
Concerning the PPL, the operations performed on coefficients are mostly additions/subtractions, multiplications, GCDs and exact divisions. Addition/subtraction is linear in the number of bits, multiplication and division are (roughly) quadratic, GCD is sub-quadratic since GMP 4.3.0.
Nothing prevents an application of the PPL from building constraints with coefficients with millions of digits. Personally, I never saw that phenomenon but, as I say, this depends on the application. Cheers,
I never saw this either.
But Tobias is right: I did not answer his question... What was missing is that, in the case of FM, those much-appreciated degenerate cases where 0 or 1 inequality is inserted tend to exist only in "well-behaved" situations with small coefficients (i.e.... degenerate).
Albert

Tobias Grosser wrote:
Sure. The same with the polyhedral representation in Graphite. It also should allow us to compose and generalize arbitrary transformations on the polyhedral data structures.
However I do not remember that we specified any 2d+1 encoding in our data structures. So at the moment we are more general, and as you might remember from early discussions it is very important for me to keep this until we have a discussion where we agree to trade off generality for performance.
Yes, this is a good thing to do. We should keep this generality as much as possible. Some parts of the API may require stronger assumptions, however.
Ideally we would have an example, a compile time problem, and a discussion where we agree that adding an assumption like 2d+1 will improve algorithmic complexity and that this improvement is required in real-world code. This information should then be part of graphite-poly.h.
It is probably best to limit the assumption/restriction to a subset of the API, and to document it in the header files.
Sure. The idea I have of Graphite is that we have a set of transformations that all analyze GPOLY and try to improve it. However, improvement does not go in only one direction. For example, there might be passes trading dependencies for memory usage: they improve memory usage but add dependencies, or remove dependencies at the cost of more memory usage.
Yes, like the expansion/contraction tradeoff.
I would like to have a clever scheduler (or some database) deciding which passes should be applied. E.g. first we trade some memory for fewer dependencies, then we try to expose more parallelism, maybe do some interchange to enable fusion and fission, and finally parallelize all the stuff.
To allow this free ordering we have to ensure that every pass is able to analyze GPOLY to see if it can apply, e.g., interchange. So, e.g., after strip mining it might still be useful to do some more transformations to allow vectorization. If the vectorization pass cannot analyze the GPOLY after strip mining is applied, this will not work.
This is a good argument.
So what I want is:
1. We have one representation, GPOLY, on which we do our optimizations.
2. Every optimization pass works only on GPOLY.
2.1 Every pass has to be able to read GPOLY and to generate GPOLY.
2.2 Every pass has to decide only by analyzing GPOLY whether it can be applied.
3. Optimizing a program is applying a lot of PASSES on GPOLY to generate better scattering and/or better data accesses.
4. Graphite will read GPOLY and, following the description in GPOLY, generate the GIMPLE code.
It might be possible that this will not work or is too slow. If there is a limitation in GPOLY, the solution I see is to extend GPOLY officially (e.g. to handle arbitrary conditions, allow changing the data accesses, ...).
Sounds good.
Of course, analyzing GPOLY does not necessarily imply recomputing the dependence polyhedra.
But I agree it should always be possible to re-analyze dependence polyhedra.
Without an exact Z-polyhedral envelope, we'll never guarantee that we do not lose key information...
Unfortunately, strip-mining will ALWAYS be one of the worst transformations in terms of information loss on a rational envelope, whatever encoding you can think of. The only workaround is to delay it as much as possible. For example, Pluto does not effectively strip-mine until all the heuristic transformation work (finding hyperplanes and fusion/distribution options) is complete.
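To illustrate the loss with a made-up example: strip-mining an iterator i (say 0 <= i <= N-1) with tile size 4 introduces a tile iterator t constrained by

  4*t <= i <= 4*t + 3

Over the integers these two inequalities pin t to floor(i/4); over the rationals, t may be anything in the interval [(i-3)/4, i/4], so the relaxed polyhedron contains points that no execution of the strip-mined loop corresponds to, and every later rational operation (projection, dependence computation) works on this larger set. The tile size also shows up as a coefficient of a variable, which is exactly the harmful kind of constant mentioned below.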
Parametric tiling approaches remove strip-mining from the polyhedral representation altogether, but they have limitations, see Renganarayanan et al. (multidimensional tiling for the price of one, PLDI'07, SC'07) and Hartono et al. (ICS'09). We can discuss it later, this is still too incomplete.
I thought the mail I was referring to was about putting restrictions on parameters in the context of CLooG. As we know that a parameter "k" is of type unsigned integer, we know that the constraint 0 <= k <= MAX_INT is always true. By adding this constraint to the context we allow CLooG to remove some conditions, as the generated code can assume that k will be within these limits.
This should be quite good in general: MAX_INT-like constants only occur in the affine constant part, not as coefficients of variables. They make it possible to cut branches in complex unions of polyhedra and to simplify the computation, without adding overhead or information loss. If those constants occur as coefficients, as in the strip-mining case, it hurts a lot (performance-wise and precision-wise).
If I got your comments right, you warned that adding large constants like MAX_INT might be bad for compile time performance. That is something I did not understand and as adding the constraint to the context improves the code generated by CLooG, I would like to keep adding these constraints as long as compile time allows.
Albert