Wikipedia:Reference desk/Mathematics

From Wikipedia, the free encyclopedia

Welcome to the mathematics section
of the Wikipedia reference desk.

How can I get my question answered?

  • Select the section of the desk that best fits the general topic of your question (see the navigation column to the right).
  • Post your question to only one section, providing a short header that gives the topic of your question.
  • Type '~~~~' (that is, four tilde characters) at the end – this signs and dates your contribution so we know who wrote what and when.
  • Don't post personal contact information – it will be removed. Any answers will be provided here.
  • Please be as specific as possible, and include all relevant context – the usefulness of answers may depend on the context.
  • Note:
    • We don't answer (and may remove) questions that require medical diagnosis or legal advice.
    • We don't answer requests for opinions, predictions or debate.
    • We don't do your homework for you, though we'll help you past the stuck point.
    • We don't conduct original research or provide a free source of ideas, but we'll help you find information you need.



How do I answer a question?

Main page: Wikipedia:Reference desk/Guidelines

  • The best answers address the question directly, and back up facts with wikilinks and links to sources. Do not edit others' comments and do not give any medical or legal advice.

April 27

Invariant theory: matrices and invariants under upper triangular matrices

Hi all, I'm learning some invariant theory for rings and I'm getting myself a bit confused with this question - feel like I might have made a mistake, and would appreciate some feedback from someone more experienced than me.

If acts on , we write for the invariant ring. In particular, if an element g acts on some finite dimensional vector space W over , then for an element of the coordinate ring , we have an action on f by .

We let denote the 2x2 matrices over , denote the upper triangular unipotent matrices (that is, matrices with '1's on the diagonal, a zero below and anything above), and denote the matrices of the form . These both act on by (left) matrix multiplication. I wish to find and , which I think means the polynomials (i.e. polynomials in the coordinate ring ) which are fixed under the map for any matrix M in respectively.

So a matrix in U looks like ; this takes . An f which is invariant under this transformation can be considered (I think) as rather than , so that : so the question is essentially asking us, if I understand correctly, to find the polynomials f which are invariant under that transformation. What can we say about such polynomials? I tried expanding it as

for all i, j, k, l, t.

I then tried looking at this as either a polynomial in or a polynomial in ; we know that all coefficients of are always zero for , and these coefficients are polynomials in the and the ; what I really wanted to do is show that these coefficients are necessarily nonzero polynomials in the , unless we assume all involved are zero; i.e. our polynomial can only be a polynomial in the latter two variables, otherwise it is not fixed for every t.

However, when I tried to determine the coefficient of as a poly in the , I find that multiple terms in can contribute to the same term in the coefficient of , and in fact the function satisfies the requirements but is obviously a function of all 4 variables (note that this is effectively the determinant, though I don't know if that has any significance). Indeed, functions in only the latter 2 variables are *included* in our class of possible functions, but they don't make up the whole thing. I'm not sure what more we can say about the class; by choice of t I think we can deduce for some h, but I'm not sure where to go from here.

What is the invariant ring exactly? Is it just (where e happens to equal ad-bc)? And likewise with the diagonal matrices, I think we get that all terms in the polynomial must be of the form to be invariant, so would we deduce the invariant ring is or something like that? Or maybe just any old 3-variable since we effectively have 3 variables? Sorry for the long question, I've just started learning invariant theory and I'm still finding it a bit confusing. Thank you for your help :) 86.26.13.2 (talk) 08:14, 27 April 2012 (UTC)[reply]


April 28

Cubic Interpolation Polynomials

Related to the interpolation polynomial I asked above, in one dimension if I have two points and I know both the function values and the derivatives at those two points, I can fit a unique cubic polynomial through them because I will have four unknowns and four constraints. How can I extend this to 2 dimensions? Let's say I have four points (one cell). What would be the form of the polynomial? Would it be a general bivariate cubic

or a bicubic like this one

The first one has fifteen unknowns and the second has eight. So which constraints are natural to use here? I am thinking of the function value, the x-partial, and the y-partial at each of the four points. But this only gives me twelve constraints. Are some constraints redundant or am I missing some? I am just confused about the "natural" extension of the above method to two variables. Thanks!75.171.224.232 (talk) 00:49, 28 April 2012 (UTC)[reply]

Bicubic interpolation may get you started. —Tamfang (talk) 17:40, 28 April 2012 (UTC)[reply]

Right, I have looked at that article (and others on interpolation I could find on Wikipedia) but my question is a little different. If I have ten points and I want to use cubic splines then I will need nine cubic polynomials, one for each interval. The way these "standard" cubic splines work is that I derive a large system and then solve for all nine cubics simultaneously. The problem I have is that I have too many points, so interpolating this way is infeasible. So I was just asking if anyone knows of a way where I can solve only one cubic (for one interval) at a time as I need them. Linear interpolation would do this but I also need the gradient to be continuous. So then I said to myself, in one dimension, if I have two points x1 and x2 and I know f(x1),f(x2),f'(x1),f'(x2) then I can find the cubic polynomial for that one interval without having to solve for all others. But I am a little confused how to extend this to two dimensions (or even higher). If I can do something like this it will solve a lot of problems I am having. Any ideas? Thanks.75.171.224.232 (talk) 20:50, 28 April 2012 (UTC)[reply]

Have you looked at Numerical Recipes (http://www.nr.com/)? They have a much better treatment of the subject than Wikipedia and focus more on application issues. Most of the interpolation methods they use are piecewise ones which only use a small number of points at a time.--Salix (talk): 22:23, 28 April 2012 (UTC)[reply]
One 2D technique they mention (NR 3.6.3) is first calculating values for the first derivatives ∂f/∂x and ∂f/∂y, and the cross derivative ∂²f/∂x∂y, at each point. With this information you can then treat each patch separately using bicubic interpolation.--Salix (talk): 23:36, 28 April 2012 (UTC)[reply]

Yep, I looked at it and it looks very promising. Thanks for this suggestion! One question though, did I read it right? They say that if I don't know the values of the partials at the grid points, then I can basically just assume whatever values I want? This way I will have the smoothness I want. It may not be very accurate but it will be smooth.75.171.224.232 (talk) 03:41, 30 April 2012 (UTC)[reply]

Yes, you can just assume the values for the partials. For example, in 1D you could assume the derivatives are always zero, giving a smooth but bumpy surface. Better is to estimate them using finite differences.--Salix (talk): 05:16, 30 April 2012 (UTC)[reply]
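For concreteness, here is a minimal sketch (Python/NumPy, with made-up example values) of the one-interval construction mentioned above: given f and f' at the two endpoints, solve a 4×4 linear system for the unique cubic. The 2D analogue in NR 3.6.3 does the same per patch with 16 coefficients, using the value, both first partials and the cross partial at the four corners.

```python
import numpy as np

def hermite_cubic(x1, x2, f1, f2, d1, d2):
    """Coefficients (a0, a1, a2, a3) of the unique cubic p(x) = a0 + a1*x + a2*x^2 + a3*x^3
    satisfying p(x1) = f1, p(x2) = f2, p'(x1) = d1, p'(x2) = d2."""
    A = np.array([[1, x1, x1**2,   x1**3],
                  [1, x2, x2**2,   x2**3],
                  [0,  1, 2*x1, 3*x1**2],
                  [0,  1, 2*x2, 3*x2**2]], dtype=float)
    return np.linalg.solve(A, np.array([f1, f2, d1, d2], dtype=float))

# hypothetical example: one interval of sin(x), using only endpoint values and derivatives
c = hermite_cubic(0.0, 1.0, np.sin(0.0), np.sin(1.0), np.cos(0.0), np.cos(1.0))
print(np.polyval(c[::-1], 0.5), np.sin(0.5))  # the cubic is close to sin at the midpoint
```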

Trigonometric Interpolation

Second interpolation question: Bo Jacoby, your suggestion of using trigonometric interpolation is a pretty cool idea. I didn't even think about it. I only have one question about that if someone can clear it up. I remember from my class that a trig polynomial is

so in two dimensions (so that I can use the FFT), what is the form of the polynomial? Is it

analogous to how we generalize polynomial interpolation like bicubics? Thank you.75.171.224.232 (talk) 01:03, 28 April 2012 (UTC)[reply]

Thanks for asking! Here are some tricks.
  1. The expression cos(2πx) + i sin(2πx) was (by Euler) identified as the complex exponential function e^(2πix). This identification makes life easier. Forget all about cos and sin and stick to the exponential e^(2πix).
  2. The identity e^(2πi) = 1 invites the (nonstandard) notation 1^x = e^(2πix). Most people think that 1^x = 1, but that convention is useless - nobody writes 1^x meaning 1. When x is rational then e^(2πix) is algebraic. So forget about the transcendental numbers e and 2π and stick to the root of unity 1^x.
  3. For fixed positive N, there are only N different values of 1^(n/N). You may think of n as an integer modulo N. The summation sign Σ_n means summation over N consecutive values of n. It doesn't matter if it is from n=1 to N, or from n=0 to N−1, or from n=−(N−1)/2 to (N−1)/2 for odd N, or from n=−N/2 to N/2−1 for even N.
  4. A trigonometric polynomial is f(x) = Σ_n a_n 1^(nx/N)/√N where the amplitudes are a_n = Σ_x f(x) 1^(nx/N)/√N. Note that complex conjugating the amplitudes is nonstandard. The standard discrete Fourier transform has different procedures for transforming forwards and backwards(!).
  5. A two dimensional trigonometric polynomial is f(x,y) = Σ_n Σ_m a_(n,m) (1^(nx/N)/√N)(1^(my/M)/√M) where the amplitudes are a_(n,m) = Σ_x Σ_y f(x,y) (1^(nx/N)/√N)(1^(my/M)/√M).
  6. For interpolation, first transform the function values to amplitudes. Then extend the array of amplitudes from (−N/2 ≤ n < N/2) to (−kN/2 ≤ n < kN/2) for some k>1, with zeroes for the high frequencies (b_(−N/2) = b_(N/2) = a_(−N/2)/2, b_n = a_n for −N/2 < n < N/2, b_n = b_(−n) = 0 for N/2 < n < kN/2, b_(−kN/2) = 0). Multiply the amplitudes by √k. Transform from amplitudes to function values.
Bo Jacoby (talk) 12:01, 28 April 2012 (UTC).[reply]
For example. x=(0,1,2,3,4,5,6,7,8). f(x)=(2,3,4,6,5,2,1,4,2). The amplitudes are an=(9.67,-0.953+2i,-0.929-1.76i,-0.333+1.15i,0.382+0.573i,0.382-0.573i,-0.333-1.15i,-0.929+1.76i,-0.953-2i). Insert 9 zeroes and multiply by √2 : bn=(13.7,-1.35+2.83i,-1.31-2.49i,-0.471+1.63i,0.54+0.81i,0,0,0,0,0,0,0,0,0,0.54-0.81i,-0.471-1.63i,-1.31+2.49i,-1.35-2.83i). Transform again: g(x)=(2, 2.83, 3, 3.12, 4, 5.3, 6, 5.8, 5, 3.72, 2, 0.706, 1, 2.7, 4, 3.5, 2, 1.34). The even items reproduce the original data, g(2x)=f(x), and the odd items interpolate between them. Bo Jacoby (talk) 11:51, 3 May 2012 (UTC).[reply]
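A rough NumPy sketch of the same zero-padding trick, using the standard (asymmetric) FFT convention rather than the symmetric 1/√N convention above, so the only scaling needed is a factor of k rather than √k; the data are the nine sample values from the example. The function name and the restriction to odd N are my own simplifications.

```python
import numpy as np

def trig_interp(f, k):
    """Interpolate the periodic sequence f by a factor k by zero-padding its FFT.
    Assumes len(f) is odd, so there is no Nyquist amplitude to split."""
    f = np.asarray(f, dtype=float)
    N = len(f)
    F = np.fft.fft(f)                 # amplitudes (standard convention, no 1/sqrt(N))
    G = np.zeros(k * N, dtype=complex)
    half = N // 2
    G[:half + 1] = F[:half + 1]       # keep the low non-negative frequencies
    G[-half:] = F[-half:]             # keep the low negative frequencies
    return (k * np.fft.ifft(G)).real  # scale by k instead of sqrt(k)

g = trig_interp([2, 3, 4, 6, 5, 2, 1, 4, 2], 2)
print(np.round(g, 2))  # even entries reproduce the data, odd entries interpolate
```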

More Interpolation

Let's say I want to interpolate using a ridiculously large number of points (like a trillion points). I know their x-values and the corresponding y-values and I want the interpolation function and its first derivative to be continuous. For this, cubic splines are perfect, but the problem is that the resulting system is far too large. A trillion points means 999,999,999,999 cubic polynomials with four unknowns in each. And the system will have something like (4 trillion)^2 entries. Even using single precision (4 bytes per entry), I would need an enormous amount of hard drive/RAM space. Granted, it will be tridiagonal, I can save memory using a sparse structure, and I can use specialized algorithms like the Thomas algorithm to solve it, but it is still too large and will take too much time and memory.

Is there any way (any cool tricks anyone knows of) where I can solve only one cubic polynomial in each interval (from the n-th to the (n+1)-st point) as I need them? I would much rather solve a bunch of small systems as I need them instead of solving them all in preprocessing. This is why I asked the cubic spline question above. I figured a cubic in any one interval can be solved independently if I give it four constraints. It doesn't have to be cubic. I would love to hear all of your thoughts on any of this or any cool idea anyone knows of. How can I compute an interpolating function on one cell only, such that the function and its derivative/gradient would be continuous if I compute and stitch all of them together? Please remember that eventually I will be doing this in at least three dimensions, so the problem will only become worse much more quickly thanks to the curse... of dimensionality.75.171.224.232 (talk) 01:19, 28 April 2012 (UTC)[reply]

I'm not convinced that I understand your question. No matter how many intervals you have, a cubic for each interval depends on the same number of local values. —Tamfang (talk) 17:48, 28 April 2012 (UTC)[reply]
I think the problem is that while the datapoints are known, the gradients are not. In fitting the splines you need to adjust adjacent patches to match gradients. First thoughts are that you could try to drastically reduce the size of your dataset before fitting splines. There are lots of techniques for this; start with Dimension reduction. How accurately are the data points obtained? Will there be some error in their measurement? If so, beware of overfitting. There are quite a few Google hits for "interpolation large datasets" with some interesting techniques; Multi Level B-Spline might be promising.--Salix (talk): 19:01, 28 April 2012 (UTC)[reply]
To assign a first derivative to each sample point, in a one-dimensional sequence, you fit a quadratic to it and its two neighbors; if the samples are equally spaced, this will give a tangent slope equal to the slope of a straight line between the two adjacent samples. In multiple dimensions I imagine it's similar. If the sample points are not a grid, you'd probably have to start with Delaunay triangulation. —Tamfang (talk) 18:48, 1 May 2012 (UTC)[reply]
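In symbols, for equally spaced samples with spacing h that quadratic fit gives the central-difference slope at the middle sample:

```latex
\[ f'(x_i) \approx \frac{f(x_{i+1}) - f(x_{i-1})}{2h} \]
```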
You can always compute a patch-wise spline and then blend the patches together. For example, if you have 200 points, you might do the following:
  1. Compute 3 splines:
    • One on the first 100 points, S1(x)
    • One on points 51-150, S2(x),
    • One on points 101-200, S3(x)
  2. Define weight functions:
    • w1(x) = 1 on points 0-50, declines linearly to zero over points 51-100, and zero otherwise
    • w2(x) = grows linearly from zero on points 51-100, declines linearly to zero over points 101-150, and zero otherwise
    • w3(x) = grows linearly from zero on points 101-150, 1 on points 151-200, and zero otherwise
  3. Define a composite curve: S(x) = (w1(x)*S1(x) + w2(x)*S2(x) + w3(x)*S3(x)) / (w1(x) + w2(x) + w3(x));
It should be obvious how to extend this to arbitrarily many patches. This will be continuous. With just a little more effort, you can make the blending function smooth and ensure the composite curve is also completely smooth. You aren't guaranteed to have the same result as the full spline, but as long as the patches are large, the edge effects will be small and you get a good approximation of the full spline (assuming you aren't overfitting in the first place). Dragons flight (talk) 19:23, 1 May 2012 (UTC)[reply]
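A minimal sketch of the three-patch blend just described, using SciPy's CubicSpline on hypothetical data (200 samples of a made-up smooth function). The weight functions are the piecewise-linear ones from the list above; they sum to 1 everywhere, so the division is only a safeguard.

```python
import numpy as np
from scipy.interpolate import CubicSpline

# hypothetical data: 200 samples of a smooth function
x = np.linspace(0.0, 10.0, 200)
y = np.sin(x) + 0.1 * np.cos(5 * x)

s1 = CubicSpline(x[:100], y[:100])       # spline on points 1-100
s2 = CubicSpline(x[50:150], y[50:150])   # spline on points 51-150
s3 = CubicSpline(x[100:], y[100:])       # spline on points 101-200

def w1(t):  # 1 up to point 51, declining linearly to 0 over points 51-100
    return np.clip((x[99] - t) / (x[99] - x[50]), 0.0, 1.0)

def w3(t):  # 0 up to point 101, growing linearly to 1 over points 101-150
    return np.clip((t - x[100]) / (x[149] - x[100]), 0.0, 1.0)

def w2(t):  # the complementary "tent" in the middle
    return np.minimum(1.0 - w1(t), 1.0 - w3(t))

def S(t):   # composite curve; each spline contributes only where its weight is nonzero
    tot = w1(t) + w2(t) + w3(t)
    return (w1(t) * s1(t) + w2(t) * s2(t) + w3(t) * s3(t)) / tot
```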

Request for applications

Could anyone supply me with explicit information about the application of second order ODEs? Firstly, I am interested in the case of linear homogeneous ODEs, i.e.

where a, b and c are constants and a is non-zero. Secondly, I am interested in the non-homogeneous case:

where ƒ is a sum or difference of exponential and trigonometric functions. Fly by Night (talk) 21:38, 28 April 2012 (UTC)[reply]

Hooke's law gives an application of the homogeneous case (with a and c terms in your equation). The b term is a damping term (see damping). The f represents the application of a force external to the system. Many other mechanical systems have similar interpretations for their respective differential equations (e.g., RLC circuits). Sławomir Biały (talk) 01:15, 29 April 2012 (UTC)[reply]
The special case cy=f is Hooke's law, that force is proportional to displacement: more load makes a spring longer. The case by'=f is Aristotle saying that force is proportional to velocity: more horses draw a wagon faster. The case ay"=f is Newton's law that force is proportional to acceleration: the falling apple. Bo Jacoby (talk) 04:10, 29 April 2012 (UTC).[reply]
Second-order fixed-coefficients differential equations sometimes come up in economics, with f being either zero or a non-zero constant. Two examples are given in Chiang, Alpha C., Fundamental Methods of Mathematical Economics, McGraw-Hill, 3rd edition, 1984, pp.529-540. One of them equates supply and demand for a good, with demand being affected not only by price P but also by the trend in price given by dP/dt and d2P/dt2. The other one involves the economy-wide inflation rate and its first and second derivatives, as it relates to the unemployment rate. In both examples the first and second derivatives appear because of the assumptions about how price or inflation expectations are formed. Duoduoduo (talk) 17:34, 29 April 2012 (UTC)[reply]
The current and voltage response of an RLC circuit are described by second order ODEs, with f(t) being an external signal (current or voltage) applied to the circuit. --CiaPan (talk) 05:37, 30 April 2012 (UTC)[reply]
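Putting the answers above together, the two standard physical readings of the same constant-coefficient equation are, schematically:

```latex
% damped driven oscillator (Hooke's law restoring force + damping + external force)
\[ m\,\ddot{y} + b\,\dot{y} + k\,y = F(t) \]
% series RLC circuit, charge q driven by a voltage source
\[ L\,\ddot{q} + R\,\dot{q} + \frac{1}{C}\,q = V(t) \]
```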


April 29

Music computation

Given the beginning of a musical composition, how to mathematically compute the rest? 118.70.177.182 (talk) 04:37, 29 April 2012 (UTC)[reply]

Of one that already exists, or for the generation of new music?--Gilderien Chat|List of good deeds 11:13, 29 April 2012 (UTC)[reply]
Of course, there are multiple ways to end a piece from a given start. However, you could make a program that would apply a standard series of variations (changes in instrumentation, substituting chords for single notes, counterpoint, changes in octave, etc.) to a given musical phrase. StuRat (talk) 11:49, 29 April 2012 (UTC)[reply]
Maybe Analytic continuation?-77.127.57.229 (talk) 12:43, 29 April 2012 (UTC)[reply]

Latex package for Dynkin diagrams

Does anyone know of a decent LaTeX package for drawing Dynkin diagrams? I can only make them look nice if I draw them "manually" (basically with bullets and rules, manually adjusting every space so that it comes out right). I have tried (perhaps inexpertly) to use xypic as well, but the diagrams come out too big (and just not looking quite right either). Does anyone know of a more targeted solution? Sławomir Biały (talk) 12:07, 29 April 2012 (UTC)[reply]

Are you aware of http://lesha.goder.com/dynkin-diagrams.html ? Looie496 (talk) 17:54, 29 April 2012 (UTC)[reply]
Yes, I had seen that a long time back. They didn't really look good enough to me to warrant the pain and inflexibility of importing EPS files. I'm looking for a flexible (or at least hackable) LaTeX solution. Given that things like xypic and amscd are possible, surely there must be a native LaTeX solution out there? Sławomir Biały (talk) 21:10, 29 April 2012 (UTC)[reply]
I've been using tikz to draw finite automata recently. It's a graph drawing package, and it should be able to do Dynkin diagrams too; I didn't read any more of the documentation than what I needed for my own stuff, but I noticed it has support for multiple edge types, so it probably has what you need.--130.195.2.100 (talk) 23:26, 29 April 2012 (UTC)[reply]
Thanks, that looks very promising. I'll give it a try. Sławomir Biały (talk) 12:30, 30 April 2012 (UTC)[reply]
PGF/TikZ :-) 86.104.57.242 (talk) 10:23, 6 May 2012 (UTC)[reply]
Have you tried dynkin-diagrams (https://ctan.org/pkg/dynkin-diagrams)?
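An untested sketch of the plain-TikZ route suggested above (not any package's official syntax): filled circles for the nodes, plain lines for single bonds, and two slightly offset parallel lines for a double bond. Something like this could be wrapped in a macro once the spacing looks right.

```latex
\documentclass{standalone}
\usepackage{tikz}
\begin{document}
\begin{tikzpicture}[scale=1.2]
  % an A4 diagram: four nodes joined by single bonds
  \foreach \i in {0,...,3} \filldraw (\i,0) circle (2pt);
  \foreach \i in {0,...,2} \draw (\i,0) -- (\i+1,0);
  % a B2/C2-style double bond, drawn as two offset parallel lines
  \foreach \i in {0,1} \filldraw (\i,-0.8) circle (2pt);
  \draw (0,-0.77) -- (1,-0.77);
  \draw (0,-0.83) -- (1,-0.83);
\end{tikzpicture}
\end{document}
```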

April 30

question about one sided limits

f is differentiable at all points except possibly at 0, and f is continuous at 0. Probe that f is differentiable at 0.--49.178.5.29 (talk) 00:05, 30 April 2012 (UTC)[reply]

And if you're not an alien with your probe handy, you can prove it instead. :-) StuRat (talk) 00:12, 30 April 2012 (UTC) [reply]
Hint: Use the definition of derivative. To bound , use an intermediate point , bound and , and use the Mean value theorem. -- Meni Rosenfeld (talk) 08:48, 30 April 2012 (UTC)[reply]

Calculating the 3 marks that drag a weighted average down the most

Hello,

Apologies if this is a very easy problem, I’m certainly no mathematician.

My university calculates an ‘honours mark’ for each student. The honours mark is basically an average of each student’s marks, weighted according to the number of credit points attributed to that subject. However, students may discount their 3 ‘worst’ subjects.

If a student’s 3 worst subjects are those that drag their weighted average down the most, how does one go about calculating what those 3 marks are? (Apart from trial and error).

Thanks, Joaq99 (talk) 04:16, 30 April 2012 (UTC)[reply]

Perhaps you could find the weighted average of all the classes, then find the 3 whose deviations from that average, when multiplied by the number of credits, are the most negative? StuRat (talk) 04:28, 30 April 2012 (UTC)[reply]
StuRat: can you prove that rule correct? Because I think it might not be. – b_jonas 11:20, 30 April 2012 (UTC)[reply]
StuRat: for example, suppose you wanted to throw away just two marks from the following four:
name grade weight
P 1 9
Q 1 1
R 3 10
S 5 10
Now the weighted average is 3, so the deviations from average multiplied by the weight are, respectively, -18, -2, 0, 20, so if I understand your rule correctly, you'd throw away the marks for subjects P and Q, which would give a weighted average of 4. However, it's better to throw away subjects P and R, as that would lead to a grade average near 4.64. – b_jonas 11:30, 30 April 2012 (UTC)[reply]
In fact it's not even true when you need to remove one mark. If A, B and C have weights 1, 2, 1 and grades 1, 0, -0.3, then the StuRat power of B is -0.35 and for C is -0.475, but the average after removing B is 0.35 and after removing C is 1/3. -- Meni Rosenfeld (talk) 11:45, 30 April 2012 (UTC)[reply]
Yes, my method is an approximation, which works in the real world, where you don't get a 10 to 1 ratio in the number of credits per class. To add a bit of a safety factor to it, you could try removing, say, each 3 of your 4 "most negative" classes, calculated by the method I specified. StuRat (talk) 17:21, 30 April 2012 (UTC)[reply]
What is this "real world" of which you speak? :)
Anyway, you'll find that my counterexample has 1:2 weight ratio. Your method is most accurate when there are many classes, in which case removing items doesn't have a great effect on the denominator. -- Meni Rosenfeld (talk) 18:32, 30 April 2012 (UTC)[reply]
Yes, and also where class grades tend to vary more than class credits. StuRat (talk) 19:07, 30 April 2012 (UTC)[reply]
[ec] I don't think this is a trivial problem. StuRat's suggestion is an approximation but the exact result is different, and the marks that are each optimal individually needn't be optimal together.
A mark that is dominated by at least 3 other marks (each having both a lower grade and a higher weight) cannot be in the optimal set, so these can be discarded (which can reduce the effective value of n). But other than that I don't know of a better way than scanning all possibilities in the general case.
Some optimization is possible by computing each average incrementally rather than from scratch. -- Meni Rosenfeld (talk) 11:36, 30 April 2012 (UTC)[reply]
Meni: nice example for removing a single grade. I agree that this is an interesting mathematical problem, even though with real-life students and grade averages it's feasible to do a brute force computer solution, or a hand computation with fast runtime for typical input.
Now as for the actual question. If you wanted to remove just one grade, then you could compute the average without each grade all in linear time. Would it give the correct result to just iterate this, repeatedly throwing away a grade in a greedy way?
Also, as a clarification, can we assume that the grades are limited to just a few values (say integers from 1 to 5)? If so, that would make this simpler. – b_jonas 12:41, 30 April 2012 (UTC)[reply]
Greedy doesn't work. Let A, B, C, D have grades 7, 2, 2, 0 and weights 2, 2, 2, 1. If you can remove one mark it should be D, but if you can remove two it's B and C. -- Meni Rosenfeld (talk) 12:59, 30 April 2012 (UTC)[reply]
Huh? I can't reproduce that one. If you can remove two, it should be C and D to get an average of -1, because if you removed B and C the weighted average would be -1.4. – b_jonas 14:59, 30 April 2012 (UTC)[reply]
Sorry, had a typo, weight of A should be 1 (2 now that I've rescaled). -- Meni Rosenfeld (talk) 15:45, 30 April 2012 (UTC)[reply]
Ah, indeed, it does work that way. Now I should try to understand why it works. – b_jonas 16:15, 30 April 2012 (UTC)[reply]
I can't say I truly get it myself. But here are two ways to think about it (which may be easier now that I've rescaled the example to use only nonnegative integers):
  • Use the StuRat approximation. D has slightly more StuRat power so if only one item is removed, it should be it. If two need to be removed, then clearly B needs to be one of them. Once that's done, the average is higher; since C has greater weight than D, this has a greater effect on its StuRat power, which now exceeds D's. (This may or may not be literally true for this example, didn't check).
  • Consider the numerator and denominator of the weighted average, and how removing an item affects them both. With one item removed, it should be D because of its effect on the numerator. The less the total weight, the more significant is the effect on the denominator; so after B is removed, C is next because removing it greatly reduces the denominator.
-- Meni Rosenfeld (talk) 18:32, 30 April 2012 (UTC)[reply]
Another way to look at it: I have a backpack. I want to cram it full of grades. I can only hold so much weight and the value per weight of each grade is not the same. How can I maximize the value? Remind anyone of an old, well-documented problem? — Preceding unsigned comment added by 128.23.112.209 (talk) 18:38, 30 April 2012 (UTC)[reply]

The invalidity of the one-at-a-time approach strikes me as somehow related to Simpson's paradox -- both involve comparing things with different but overlapping denominators.

Also, this problem of efficiently finding the ones to remove seems very similar to this problem: given a regression equation in which it is postulated that k of the n data points are outliers that should not have been included in the regression (because they may be from a different causative structure), how do you efficiently find the set of k data points whose deletion will most improve the fit of the regression? See for example Cook's distance and Outlier#Identifying outliers. Duoduoduo (talk) 19:31, 30 April 2012 (UTC)[reply]

Just an aside, but the harder the true answer is to find, the less likely that the university is actually applying the "correct" solution. Joaq99, I would be surprised if the university has really thought this out. More likely someone is doing something simple, like canceling the three lowest grades regardless of weight. While the mathematical puzzle is undoubtedly interesting, you might be more likely to get at the truth by inquiring with your university about what procedure they are actually using to remove the "worst" grades. Dragons flight (talk) 19:45, 30 April 2012 (UTC)[reply]

This paper (http://www.ics.uci.edu/~eppstein/pubs/EppHir-TR-95-12.pdf) addresses this particular problem. --Modocc (talk) 20:55, 30 April 2012 (UTC)[reply]

Dragons flight: in a realistic case, it's completely feasible for the university to even use a brute force computation with a computer. We have implicitly started discussing the abstract problem where the number of grades can be large, and the allowed grade values and the allowed grade weights needn't be integers taken from a very small predetermined set. – b_jonas 21:26, 30 April 2012 (UTC)
Modocc: nice find! That's indeed exactly the same problem. – b_jonas 21:29, 30 April 2012 (UTC)[reply]
Notably, they give an algorithm. -- Meni Rosenfeld (talk) 05:21, 1 May 2012 (UTC)[reply]
From the above conversation it seems that a 'brute force' calculation is the way to go. I'm assuming this refers to a computer program that calculates the weighted average eliminating every possible combination of 3 and simply takes the maximum of all the weighted averages.
I would like to thank everyone for their efforts (even if you enjoy it, it's much appreciated). I'm interested in this discussion even beyond its practical implications for me.
Dragons flight -- The university was initially removing the 3 subjects with the lowest (mark * weight). When I complained that this was clearly wrong, the university said they would change their approach to simply removing the 3 subjects with the lowest marks. I'm unhappy with this approach too but I thought that before I proceeded with the complaint I should try and figure out the correct way to do so. I soon figured out that finding such a method was beyond my high school mathematics skills. Joaq99 (talk) 01:30, 1 May 2012 (UTC)[reply]
Lowest mark*weight, or lowest (mark-average)*weight (StuRat power)? So a grade of 1 with weight 1 will be removed rather than a grade of 1 with weight 100? Most negative StuRat power is a handy approximation; lowest grade is a handy approximation; lowest mark*weight is just idiotic and whoever is responsible for it should be fired.
They should just use brute force. It takes exactly 5 minutes to write a program to do that. Is the final grade of all university students really that unimportant? -- Meni Rosenfeld (talk) 04:30, 1 May 2012 (UTC)[reply]
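In that spirit, a five-minute brute-force sketch in Python (the function name and data layout are mine; marks are (grade, weight) pairs):

```python
from itertools import combinations

def best_subjects_to_drop(marks, k=3):
    """Try every way of discounting k subjects and keep the one that maximises
    the weighted average of what remains.  marks is a list of (grade, weight) pairs."""
    def weighted_avg(kept):
        return sum(g * w for g, w in kept) / sum(w for _, w in kept)
    return max(combinations(range(len(marks)), k),
               key=lambda drop: weighted_avg(
                   [m for i, m in enumerate(marks) if i not in drop]))

# b_jonas's example above, dropping 2 of 4 marks: the best choice is P and R (indices 0 and 2)
print(best_subjects_to_drop([(1, 9), (1, 1), (3, 10), (5, 10)], k=2))
```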
Good luck with that Joaq. I've been part of several universities, and though they all had many smart faculty, I've pretty much invariably found that the staff responsible for processing grades, transcripts, and the like had trouble understanding all but the most trivial of mathematics. I'm not sure why that should be, but it certainly seems like whatever the actual requirements for that job (mostly clerical skills I assume), that numerical literacy was not among them. The only calculations I really trust them to do and describe accurately are the very simplest imaginable. Of course that's just my experience, your experience may be better. Dragons flight (talk) 05:29, 1 May 2012 (UTC)[reply]
A closer look: If the sum of all weights is W and the sum of all weight*grade products is T, then the average starts at T/W. If a class with weight w and grade g is dropped, the new average is (T − wg)/(W − w) = T/W − w(g − T/W)/(W − w). The second term has the StuRat power in it, which is why the approximation works when W is large enough.
If you drop two classes, the average is
That is, to second order you get the same terms as with dropping each individually, but with the additional term which can give an advantage over the best individual class to drop. In particular, a class which has a low weight and low grade will greatly reduce the part (because the weights are multiplied) but will only somewhat increase (since the grades are added, and a doubling of the distance from the average is diluted by the other grade), thus it may be better to drop two classes of higher grade but also higher weight. -- Meni Rosenfeld (talk) 10:58, 1 May 2012 (UTC)[reply]
The Eppstein-Hirschberg paper cited above by Modocc, despite a promising opening paragraph, is not about the same problem as we are talking about here. That paper maximized where is a grade and is a weight (see their eqs. (1) and (2)). But the grade-point averages that we are trying to maximize are . Duoduoduo (talk) 17:34, 1 May 2012 (UTC)[reply]
Our problem is trivially reduced to the problem in the paper by setting . -- Meni Rosenfeld (talk) 18:12, 1 May 2012 (UTC)[reply]

May 1

May 2

May 3

Octahemioctahedron

Octahemioctahedron

Why is the octahemioctahedron the only orientable hemipolyhedron? Double sharp (talk) 12:23, 3 May 2012 (UTC)[reply]

It seems from the article that there are only 9 hemipolyhedrons (I don't know why these are the only ones). It's not hard to check that the octahemioctahedron is the only orientable one. Look at the faces that don't pass through the center. Ones that meet at a vertex have the opposite orientation. In order for the polyhedron to be orientable, there can't be any odd walk through the outside faces sharing vertices. This only happens on the octahemioctahedron (which is due to the fact that the outside faces are arranged like the vertices of a cube, which is bipartite). Rckrone (talk) 05:37, 5 May 2012 (UTC)[reply]
Thank you for your clear explanation. It's not hard to prove that there are only 9 hemipolyhedra, by the way. The hemipolyhedra are closely related to the quasiregular polyhedra. Some edges of the quasiregular polyhedra with four faces at a vertex form the faces passing through the centre of the hemipolyhedra: for example, the cuboctahedron can generate regular hexagons. To keep the polyhedron having two faces at an edge, one set of faces must be discarded: discarding the triangles gives the cubohemioctahedron, while discarding the squares gives the octahemioctahedron. This doesn't work for the quasiregular polyhedra having six faces at a vertex (the three ditrigonal polyhedra), because they don't have "equators". There are 5 quasiregular polyhedra with four faces at a vertex (the rectified Platonic and Kepler-Poinsot solids), and each can generate 2 hemipolyhedra, except the octahedron as a tetratetrahedron, which can only generate one because both types of faces are the same. Hence there can only be (4 × 2) + 1 = 9 hemipolyhedra. (Other polyhedra, such as the regular dodecahedron, can also have facets that pass through the centre, but these facets are not regular polygons and so no new uniform hemipolyhedra are obtained.) Double sharp (talk) 03:23, 6 May 2012 (UTC)[reply]

4-fold and 5-fold rotational symmetry

Why is it that Euclidean polyhedra cannot have both 4-fold and 5-fold rotational symmetry? What about polychora (4-polytopes) and higher? Double sharp (talk) 12:31, 3 May 2012 (UTC)[reply]

I think it is due to the fact that if 4 pentagons met at a vertex, the total angle there would be 4 × 108° = 432°, which exceeds 360°. A slightly more rigorous approach is to consider Möbius triangles. As 1/2 + 1/4 + 1/5 < 1, the corresponding Möbius triangle with angles π/2, π/4, π/5 lies in the hyperbolic plane, giving the Order-5 square tiling or Order-4 pentagonal tiling. What's confusing me at the moment is the Schwarz triangle with angles π/2, π/4, 2π/5: here 1/2 + 1/4 + 2/5 > 1, so it should be spherical, but it does not seem to exist.--Salix (talk): 07:37, 5 May 2012 (UTC)[reply]
Perhaps it covers an irrational fraction of the sphere? Double sharp (talk) 03:46, 6 May 2012 (UTC)[reply]

How widespread is the notation

in elementary schools around the world? (meaning a divided by b).— Preceding unsigned comment added by ‎ OsmanRF34 (talkcontribs) 21:51, 3 May 2012

According to the Long division article, it's the notation used in English-speaking countries, China, Japan and India. There are several different notations in use: one in Brazil, Venezuela and Colombia; one in the rest of Latin America; one in Spain, Italy, France, Portugal, Romania, Turkey, Greece and Russia; and one in Germany, Norway, Macedonia, Poland, Croatia, Slovenia, Hungary, the Czech Republic, Slovakia and Bulgaria. --Salix (talk): 22:49, 3 May 2012 (UTC)[reply]

May 4

Uniform dual polyhedra

If all the Catalan solids have constant dihedral angles, do all the duals of the uniform polyhedra (the nonconvex ones) also have constant dihedral angles? (I suspect the answer is yes.) Double sharp (talk) 03:12, 4 May 2012 (UTC)[reply]

Yeah, the canonical dual having constant dihedral angles follows from the original polyhedron having regular polygons as faces. It shouldn't depend on convexity of the polyhedron, or even convexity of the faces. Rckrone (talk) 04:58, 5 May 2012 (UTC)[reply]

May 5

Summation

Could someone please explain to me why . I can see we're using the 2 j=i+r+1 but I don't understand why the binomial coefficient ends up the way it does. Thanks. 131.111.184.11 (talk) 12:51, 5 May 2012 (UTC)[reply]

The binomial coefficients are reversed in the order of summation with the substitution you have used. Rather consider the substitution j = m − i, and you will have to swap the limits because generally m ≠ r + 1. — Quondum 13:23, 5 May 2012 (UTC)[reply]
double counting? -- Taku (talk) 10:33, 6 May 2012 (UTC)[reply]

Normalise a DE

What does it mean to "normalise" a DE? I am told to normalise by the substitution . My working out is . . So the DE is . .


So the DE is . Doesn't look very normal to me. 150.203.114.37 (talk) 20:11, 5 May 2012 (UTC)[reply]

Did you mean ? Otherwise your substitution looks off. 129.234.53.19 (talk) 21:41, 5 May 2012 (UTC)[reply]
Yes, that is what I meant. 150.203.114.37 (talk) 22:12, 5 May 2012 (UTC)[reply]
Except it actually is . WHOOPS. I'll correct it and come back if I'm still stuck. 150.203.114.37 (talk) 22:14, 5 May 2012 (UTC)[reply]
OK, so the DE is I think. Still curious to know what "normalise" means. 150.203.114.37 (talk) 01:11, 6 May 2012 (UTC)[reply]

May 6

pearsons r and F test

When computing an F test and Pearson's r, if significance testing is conducted for both tests using the same set of data, why is it possible to achieve contrary results? please give the answer

Information theory and infinities

(Note that all of this is from a completely lay perspective): where would I go to read about the parts of information theory that deal with "infinite amounts" of information? Is that even a thing? For example, if I have some real number a (unspecified beyond the fact that it is real), then revealing that it is, say, a trillion has narrowed down my a-space from being uncountably large to being a single number. Using the traditional definition of information (base-2 log of the size of the old possibility space over the size of the new one), then I have transmitted infinite information (of course you couldn't actually construct a real-life function that outputs any real number with uniform probability, but I think the idea is clear). --122.150.246.242 (talk) 07:49, 6 May 2012 (UTC)[reply]

I think you're confusing the information that a number is real with knowing the number in "full detail", i.e. all digits. The former is not an "infinite amount of information" any more than linking to pi would make this page never end loading :-) 86.104.57.242 (talk) 11:11, 6 May 2012 (UTC)[reply]
I do know all the digits. There's a 1, then 12 zeroes, then a decimal point, then infinitely many zeroes. A trillion, like I said. I narrowed down "any real number" to a particular real number, not "one of ten digits" to a particular digit. Put it this way: if I say, "a can be any real number, with uniform probability", then what is the entropy of a? --122.150.246.242 (talk) 11:45, 6 May 2012 (UTC)[reply]
Differential entropy - the entropy of a continuous distribution - is relevant. If X and Y are uniform and normal variables of the same mean and variance, then transmitting either of them exactly requires infinite information, but transmitting Y still requires a few bits more than X. So it's the difference in information content that is studied.
This can also be considered in terms not of transmitting the value of X exactly, but rather of the limiting process of transmitting it with arbitrarily high accuracy. The amount of information depends on the accuracy, but for any given accuracy, different variables will require different amounts of information, and this is the difference in their entropy. -- Meni Rosenfeld (talk) 12:09, 6 May 2012 (UTC)[reply]
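For reference, the differential entropy being described, and its values for the uniform and normal distributions of equal variance σ² (in nats):

```latex
\[
  h(X) = -\int f(x)\,\ln f(x)\,dx, \qquad
  h_{\mathrm{uniform}} = \ln\bigl(\sqrt{12}\,\sigma\bigr) \approx \ln\sigma + 1.24, \qquad
  h_{\mathrm{normal}} = \tfrac{1}{2}\ln\bigl(2\pi e\,\sigma^{2}\bigr) \approx \ln\sigma + 1.42 .
\]
```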


If you can transmit an arbitrary real number, then you can indeed transmit infinitely much information. In some sense it's true that every real number (not just random-looking ones) has infinite information content; if you choose a real number from a uniform distribution on [0,1] and it turns out by random chance that it comes out to be exactly zero point five zero repeating, then yes, that event encodes infinitely many bits of information. Of course the probability of the number coming out to be exactly that is zero (same as the probability of it being any other particular number). --Trovatore (talk) 00:45, 8 May 2012 (UTC)[reply]

Is there an "abelian version" of the free product?

I'm thinking of an operation that would glue together two structures (say two groups G1, G2) such that any pair with one element from G1 and one from G2 would commute, e.g. the free abelian group of rank 2 would be the result of this operation applied to two free abelian groups of rank 1. If not for groups, is it possible to define it for categories/semigroups etc.? Thank you for your time. 86.104.57.242 (talk) 10:20, 6 May 2012 (UTC)[reply]

Maybe I'm misunderstanding, but doesn't the direct product do this? Staecker (talk) 12:58, 6 May 2012 (UTC)[reply]
It sounds like you want the result to be abelian, even if the components aren't. Then the best you'll find for groups will be to first abelianize, then take the direct product. But the abelianization of a group might be trivial, so you'll probably find this unsatisfying.--130.195.2.100 (talk) 00:17, 7 May 2012 (UTC)[reply]
You can take the free product of G1 and G2 and then force elements of G1 to commute with elements of G2 without abelianizing G1 and G2 themselves, but like Staecker said this is exactly the same as the direct product. Rckrone (talk) 00:42, 8 May 2012 (UTC)[reply]
Yeah, this is what I had in mind. Thanks. 86.104.57.242 (talk) 09:29, 8 May 2012 (UTC)[reply]
This notion extends to a class of groups called graph products (an unfortunately vague name). They are neatly defined by choosing a (finitely generated) group for each vertex of a (finite) simplicial graph, taking their free product, and quotienting out to get the commuting relations you want, whenever the groups' vertices are joined by an edge in the graph. If your graph is complete, you just get a direct product as mentioned above. However, you can get much more interesting groups. For instance, the class of graph products contains all right-angled Artin groups, which are interesting in and of themselves. Icthyos (talk) 13:18, 8 May 2012 (UTC)[reply]
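In symbols, a sketch of the constructions Rckrone and Icthyos describe (here Γ is a simplicial graph with a group G_v attached to each vertex):

```latex
% forcing the two factors of a free product to commute recovers the direct product
\[
  (G_1 \ast G_2)\,/\,\langle\!\langle\, [g_1,g_2] : g_1 \in G_1,\; g_2 \in G_2 \,\rangle\!\rangle
  \;\cong\; G_1 \times G_2
\]
% a graph product imposes these relations only across edges of the graph \Gamma = (V,E)
\[
  G_\Gamma \;=\; \Bigl(\mathop{\ast}_{v \in V} G_v\Bigr)\,/\,
  \langle\!\langle\, [g_u, g_v] : \{u,v\} \in E,\; g_u \in G_u,\; g_v \in G_v \,\rangle\!\rangle
\]
```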

Meaning of a symbol in a specific context

In [1] on page 1, theorem 2 says

If p is an odd prime number, then .

What does the dot before the stand for? -- Toshio Yamaguchi (tlkctb) 12:14, 6 May 2012 (UTC)[reply]

Looks to me like multiplication, probably a typo. "Proof 1 of Theorem 2" proves it as if it were multiplication. Staecker (talk) 12:47, 6 May 2012 (UTC)[reply]
Okay, thanks. Looking at some other formulas in the paper it seems this is a result of the typeface used in the paper. -- Toshio Yamaguchi (tlkctb) 12:58, 6 May 2012 (UTC)[reply]
It's a typical LaTeX document. The author seems to have used "." instead of "\cdot". Staecker (talk) 12:59, 6 May 2012 (UTC)[reply]
The author meant to write:
If 2n+1 is a prime number, then .
Bo Jacoby (talk) 15:18, 6 May 2012 (UTC).[reply]

vector identities

Use identities to show that where and is a constant vector? Which identities do I use? Widener (talk) 17:55, 6 May 2012 (UTC)[reply]

Green's theorem

Consider the integral where is the ellipse . Explain why Green's theorem cannot be applied to the region interior to C to evaluate the above integral, and show that a blind application of this theorem would lead to the incorrect result . I simply don't know why Green's theorem doesn't work. It looks like it should. As for applying Green's theorem, I get which is a monstrosity. Can someone evaluate this (and is what I've done so far correct)? Widener (talk) 18:28, 6 May 2012 (UTC)[reply]

The problem is that the functions are not defined at (0,0), which is in the region bounded by the curve. The way to handle this problem is to draw a circle around (0,0); since this is a conservative field in the region between the curves, the integral on the outer curve is equal to the one on the inner curve (which is simpler).
You have a missing negation in evaluating the integral, the two derivatives need to cancel each other out. The integral you wrote is not only unwieldy, it doesn't converge. -- Meni Rosenfeld (talk) 20:20, 6 May 2012 (UTC)[reply]
Do the derivatives cancel? Now I get . The derivatives don't have to cancel in order for the integral to be zero. Widener (talk) 20:51, 6 May 2012 (UTC)[reply]
The derivatives don't have to cancel, but in this case they do. This kind of problem is usually given with a conservative field. Also note that your latest integrand is nonnegative, so the integral can't be 0.
You have another derivative mistake in addition to the sign; it should be . -- Meni Rosenfeld (talk) 04:14, 7 May 2012 (UTC)[reply]

Second order ODEs

Hi folks. Can anyone supply some nice (applied) problem sheets for second order, linear, homogeneous ODEs? I'm trying to write a homework sheet, but I want to make the questions seem relevant. There's only so many times I can mention springs (simple and damped harmonic motion) before it wears thin. Can anyone suggest a link to some homework problems? Fly by Night (talk) 22:18, 6 May 2012 (UTC)[reply]

Have you seen the examples in the article on oscillation? Note that the second order linear homogeneous ODE with constant coefficients also describes the onset of explosively increasing oscillations. Bo Jacoby (talk) 06:07, 8 May 2012 (UTC).[reply]

May 7

Beta Function

For the beta function, I have the relation , for Re(u)>0 and Re(v)>0. Given this, I have to show that the beta function is analytic except for simple poles and then find the residues at the poles. Now, it is clear from this relation that , so we see that the beta function has simple poles at the non-positive integers. Does this suffice for showing that it is analytic except for the poles that we have located? And how does one go about determining the values of the residues at the poles? Thanks. meromorphic [talk to me] 10:50, 7 May 2012 (UTC)[reply]

I think this argument only works if you know that B(u,v) has no poles for Re(u)>0 and Re(v)>0. Then for any point in C2 you can pull this pole-free region back to it the way you suggested. Edit: Actually if you only have this relation for Re(u)>0 and Re(v)>0 then it tells you nothing about the values of B(u,v) for Re(u)<0. Edit again: Although I guess maybe you get that they agree below Re(u)>0 when you define these values with analytic continuation, but I don't really remember anything about analytic continuation. Rckrone (talk) 01:37, 8 May 2012 (UTC)[reply]

Riemann P function

I have established, through manipulations of P symbols, that

.

I believe this shows that is a branch of the second P function above (is this correct?). I now have to find a linearly independent branch, which I am unsure of how to do. Can someone help me please? Thanks. meromorphic [talk to me] 12:17, 7 May 2012 (UTC)[reply]

Table problem

Given a table with 2 columns and n rows (n from 2 to 10), plus a sum for each row and a sum for each column, is it possible to fill the table, such that the values in each column add up to its sum, and for each row, the value in the first cell plus double the value in the second cell equals the sum for that row? If not possible, why? If possible, is there an efficient algorithm for filling the table with suitable values? Thanks. — Preceding unsigned comment added by 220.255.1.139 (talk) 15:12, 7 May 2012 (UTC)[reply]

You have 2n unknowns and only n+2 constraints, so yes, it is possible, but if n > 2 the solution is not unique. One approach is to put 0 in the second column for the first n-2 rows and put the sum of each row in the first column. Then you are left with 4 linear equations in 4 unknowns, which is not difficult to solve. Gandalf61 (talk) 16:51, 7 May 2012 (UTC)[reply]
A solution only exists if the first column sum plus twice the second column sum is equal to the total of all row sums. -- Meni Rosenfeld (talk) 20:42, 7 May 2012 (UTC)[reply]
So in fact if this condition mentioned by Meni Rosenfeld is satisfied, there is a linear dependence among the constraints and there are n-1 degrees of freedom. You can put 0 (or whatever values you like) in all but one of the entries in the second column (or all but one in the first column, or mix and match as you like), and the rest will be determined in the obvious way. Rckrone (talk) 22:11, 7 May 2012 (UTC)[reply]
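A small sketch of Rckrone's recipe (my own function name; it returns None when Meni's consistency condition fails): put 0 in the second column everywhere except one row, which then absorbs the whole second-column sum.

```python
def fill_table(row_sums, col1_sum, col2_sum):
    """Each row i must satisfy a[i] + 2*b[i] = row_sums[i]; the columns must sum
    to col1_sum and col2_sum.  Solvable iff col1_sum + 2*col2_sum == sum(row_sums)."""
    if col1_sum + 2 * col2_sum != sum(row_sums):
        return None
    n = len(row_sums)
    b = [0] * n
    b[-1] = col2_sum                       # one free entry soaks up the second column sum
    a = [r - 2 * bi for r, bi in zip(row_sums, b)]
    return list(zip(a, b))                 # the a-column then automatically sums to col1_sum

# example: 3 rows with sums 5, 7, 9; column sums 11 and 5 (check: 11 + 2*5 = 21 = 5+7+9)
print(fill_table([5, 7, 9], 11, 5))        # [(5, 0), (7, 0), (-1, 5)]
```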

Probability ball

Let be continuous, and such that , and Prove that, for all , there exist increasing, and , such that . Widener (talk) 22:07, 7 May 2012 (UTC)[reply]

Should there be an additional factor of 1/(N+1) on the sum? The step function should also integrate to 1. Rckrone (talk) 22:21, 7 May 2012 (UTC)[reply]
Ah, the thing is . Widener (talk) 22:25, 7 May 2012 (UTC)[reply]
Oh yeah, I forgot they are weighted by pj. For any non-negative integrable function f, you can find some bounded interval that has most of the weight. Then on that interval f is uniformly continuous so you can approximate it with these boxes of width δ. Since , the last box's height is determined by the previous ones, but the error there should be small since the two functions both integrate to 1. I'm not sure where comes in, so maybe I'm missing something. Rckrone (talk) 22:33, 7 May 2012 (UTC)[reply]
That seems very informal - is that a correct proof? Can you make it more formal? Widener (talk) 23:24, 7 May 2012 (UTC)[reply]
Yes, I'm fairly sure that I can. ;) Rckrone (talk) 23:36, 7 May 2012 (UTC)[reply]
You can? Then why you haven't you shown me what it is already? And do you use ? Widener (talk) 23:55, 7 May 2012 (UTC)[reply]
Seriously, what's the answer? I can't figure it out and it's due in an hour. Widener (talk) 05:48, 8 May 2012 (UTC)[reply]
Ok I don't think I'm supposed to do this, but: Choose M so that . By uniform continuity of f on [-M,M], we can find δ such that for all |x - y| < δ, |f(x)-f(y)| < ε/6M. Now pick N+1 and the xi so the δ/2 balls cover [-M,M]. Choose each pi so that pi/δ is the average value of f on [xi-δ/2, xi+δ/2]. Letting p be the step function defined by these values of pi, show that |f(x) - p(x)| < ε/6M for all x in [-M,M]. This is good but these pi don't sum to 1, so let and let p' be the step function defined by replacing p0 with p'0. Use the fact that and , to show that |p0/δ - p'0/δ| < ε/3δ. Then and finally Rckrone (talk) 06:21, 8 May 2012 (UTC)[reply]

How do you show completeness?

Show is a complete metric space (). Widener (talk) 23:39, 7 May 2012 (UTC)[reply]

More generally, how do you show that metric spaces are complete? Widener (talk) 00:26, 8 May 2012 (UTC)[reply]
A metric space is complete if every Cauchy sequence has a limit point in the space. I think here you need to show that if you have a Cauchy sequence of continuous functions, that (a) they converge point-wise, and (b) that the function described by these point-wise limits is continuous. Rckrone (talk) 00:34, 8 May 2012 (UTC)[reply]
All right; it is quite easy to show that every Cauchy sequence converges to some function under this metric. How do we know this limit is continuous? Hmm, is this convergence uniform perhaps? How would you show that? Widener (talk) 00:52, 8 May 2012 (UTC)[reply]
Yes, it's uniform. You can find an explicit bound on |x(t) - y(t)| for all t in terms of ρ(x,y). Rckrone (talk) 01:11, 8 May 2012 (UTC)[reply]