Revision as of 23:09, 4 September 2010

Welcome to the mathematics section
of the Wikipedia reference desk.


How can I get my question answered?

Select the section of the desk that best fits the general topic of your question (see the navigation column to the right).
Post your question to only one section, providing a short header that gives the topic of your question.
Type '~~~~' (that is, four tilde characters) at the end – this signs and dates your contribution so we know who wrote what and when.
Don't post personal contact information – it will be removed. Any answers will be provided here.
Please be as specific as possible, and include all relevant context – the usefulness of answers may depend on the context.
Note:
- We don't answer (and may remove) questions that require medical diagnosis or legal advice.
- We don't answer requests for opinions, predictions or debate.
- We don't do your homework for you, though we'll help you past the stuck point.
- We don't conduct original research or provide a free source of ideas, but we'll help you find information you need.

Ready? Ask a new question!

How do I answer a question?

Main page: Wikipedia:Reference desk/Guidelines

The best answers address the question directly, and back up facts with wikilinks and links to sources. Do not edit others' comments and do not give any medical or legal advice.

August 29

linear transformation of standard normal random variables

$\mathbf {Y} =A\mathbf {X}$ where $\mathbf {X}$ is a vector of iid $N(0,1)$ random variables.

I need to show that $\mathbf {Y}$ is also iid $N(0,1)$ if $A$ is an orthogonal matrix.

I have the result that for $\mathbf {Y} =A\mathbf {X}$ we have $f_{\mathbf {Y} }(\mathbf {y} )={\frac {f_{\mathbf {X} }(A^{-1}\mathbf {y} )}{|A|}}$

With an orthogonal matrix substituted into the multivariate standard normal formula things cancel and I get a numerator that works out to be multivariate standard normal which is what I want.

But I'm left with that determinant on the bottom, which for an orthogonal matrix is 1 or -1. If it's 1, no problem. But if it's -1, does that mean my distribution is upside down? —Preceding unsigned comment added by 130.102.158.15 (talk) 02:40, 29 August 2010 (UTC)[reply]

Consider first the one-dimensional case. The transformation Y=−X swaps the positive and negative values of X. It does not turn the distribution upside down. Bo Jacoby (talk) 06:41, 29 August 2010 (UTC).[reply]

...which means that the denominator should actually be the absolute value of the determinant. -- Meni Rosenfeld (talk) 08:43, 29 August 2010 (UTC)[reply]

In this case, if the determinant of the matrix $A$ is 1, I have $f_{y}(y)=f_{x}(y)$ where $f_{x}$ is a normal pdf...but if the determinant is -1, I get $f_{y}(y)=-f_{x}(y)$ and that's not a normal pdf unless $-f_{x}(y)=f_{x}(-y)$, which I'm pretty sure it's not -- Preceding unsigned comment added by 130.102.158.15 (talk) 08:44, 29 August 2010 (UTC)[reply]

Where did the minus sign come from? -- Meni Rosenfeld (talk) 09:46, 29 August 2010 (UTC)[reply]

aaaargh, sorry I had my notation mixed up and didn't realize that the denominator is the ABSOLUTE value of the determinant, I thought it was just the determinant! —Preceding unsigned comment added by 130.102.158.15 (talk) 21:55, 29 August 2010 (UTC)[reply]
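The resolution above can be written out as a short worked derivation. This is a sketch under the assumption that $\mathbf {X}$ has dimension $n$ (a symbol not used in the thread) and that $A$ is orthogonal, so $A^{-1}=A^{\mathrm {T}}$ and $|\det A|=1$:

```latex
\begin{aligned}
f_{\mathbf{X}}(\mathbf{x}) &= (2\pi)^{-n/2}\exp\!\left(-\tfrac{1}{2}\,\mathbf{x}^{\mathrm{T}}\mathbf{x}\right),\\
f_{\mathbf{Y}}(\mathbf{y}) &= \frac{f_{\mathbf{X}}(A^{-1}\mathbf{y})}{|\det A|}
 = (2\pi)^{-n/2}\exp\!\left(-\tfrac{1}{2}\,(A^{\mathrm{T}}\mathbf{y})^{\mathrm{T}}(A^{\mathrm{T}}\mathbf{y})\right)\\
 &= (2\pi)^{-n/2}\exp\!\left(-\tfrac{1}{2}\,\mathbf{y}^{\mathrm{T}}AA^{\mathrm{T}}\mathbf{y}\right)
 = (2\pi)^{-n/2}\exp\!\left(-\tfrac{1}{2}\,\mathbf{y}^{\mathrm{T}}\mathbf{y}\right).
\end{aligned}
```

The last expression factors into a product of $n$ standard normal densities, so the components of $\mathbf {Y}$ are iid $N(0,1)$ whether $\det A$ is 1 or -1.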

Euler-Maruyama and Milstein schemes

Can anybody suggest where I might be able to find proofs of the order of convergence for the Euler-Maruyama and/or Milstein schemes used for stochastic differential equations? —Preceding unsigned comment added by Damian Eldridge (talk • contribs) 06:24, 29 August 2010 (UTC)[reply]

The Euler-Maruyama method article has a reference that you might look at. 67.119.3.248 (talk) 09:03, 30 August 2010 (UTC)[reply]

Trig Question

My days of battling with trig identities are unfortunately a little far behind me. Is it possible to express ${\frac {d}{b^{2}c^{2}-(a^{2}-b^{2})^{2}}}[(a^{2}-b^{2})\cos(bt)+bc\sin(bt)]$ as $A\cos(bt-\theta )$ ? Thanks asyndeton talk 16:33, 29 August 2010 (UTC)[reply]

Actually, maybe they're closer than I think. Is it just $A={\frac {d}{b^{2}c^{2}-(a^{2}-b^{2})^{2}}}$, $\tan(\theta )={\frac {a^{2}-b^{2}}{bc}}$? asyndeton talk 16:45, 29 August 2010 (UTC)[reply]

Certainly a linear combination of sine and cosine functions of the same period can be expressed as a single cosine function with that same period and with a phase shift. More later...... Michael Hardy (talk) 16:48, 29 August 2010 (UTC)[reply]

More specifically, see List_of_trigonometric_identities#Linear_combinations. Michael Hardy (talk) 17:09, 30 August 2010 (UTC)[reply]

To quote:

In the case of a linear combination of a sine and cosine wave (which is just a sine wave with a phase shift of π/2), we have

$a\sin x+b\cos x={\sqrt {a^{2}+b^{2}}}\cdot \sin(x+\varphi )$

where

$\varphi ={\begin{cases}\arcsin \left({\frac {b}{\sqrt {a^{2}+b^{2}}}}\right)&{\text{if }}a\geq 0,\\\pi -\arcsin \left({\frac {b}{\sqrt {a^{2}+b^{2}}}}\right)&{\text{if }}a<0,\end{cases}}$

or equivalently

$\varphi =\arctan \left({\frac {b}{a}}\right)+{\begin{cases}0&{\text{if }}a\geq 0,\\\pi &{\text{if }}a<0.\end{cases}}$

Michael Hardy (talk) 17:11, 30 August 2010 (UTC)[reply]
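For the cosine form asked about, the analogous identity is $p\cos x+q\sin x=R\cos(x-\theta )$ with $R={\sqrt {p^{2}+q^{2}}}$, $R\cos \theta =p$ and $R\sin \theta =q$. Applied to the original expression with $p=a^{2}-b^{2}$ and $q=bc$, this would suggest $\tan \theta ={\frac {bc}{a^{2}-b^{2}}}$ and an amplitude that carries the extra factor ${\sqrt {(a^{2}-b^{2})^{2}+b^{2}c^{2}}}$. The following is only a numerical sanity check of the identity; the function names are made up for illustration.

```python
import math

def as_single_cosine(p, q):
    """Return (R, theta) such that p*cos(x) + q*sin(x) == R*cos(x - theta)."""
    R = math.hypot(p, q)        # sqrt(p^2 + q^2)
    theta = math.atan2(q, p)    # picks the quadrant with R*cos(theta)=p, R*sin(theta)=q
    return R, theta

def max_error(p, q, samples=1000):
    """Largest absolute difference between the two forms over one period."""
    R, theta = as_single_cosine(p, q)
    worst = 0.0
    for k in range(samples):
        x = 2 * math.pi * k / samples
        lhs = p * math.cos(x) + q * math.sin(x)
        rhs = R * math.cos(x - theta)
        worst = max(worst, abs(lhs - rhs))
    return worst

print(as_single_cosine(3.0, 4.0))
print(max_error(3.0, 4.0), max_error(-2.0, 0.5))
```

Using `atan2` rather than a bare arctangent avoids the case split on the sign of the cosine coefficient.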

http://en.wikipedia.org/wiki/Wikipedia:RDMA#Semicircles_problem

Here Michael Hardy said that "if a line passes through the centers of two circles in the same plane that touch each other, then it must pass through the point where they touch each other". I can see why this is so for externally tangent circles, but why is it true for internally tangent circles as well? In other words, how do you know that the line through the two centers also passes through the point of tangency? -- Preceding unsigned comment added by 68.248.229.115 (talk) 18:17, 29 August 2010 (UTC)[reply]

The circles have a common tangent at the point where they touch. The lines from the centre of each circle to this point meet the tangent line at right angles, hence they coincide. Or just appeal to the symmetry of the layout. →86.132.161.214 (talk) 22:45, 29 August 2010 (UTC)[reply]

Or just reflect the inner circle in the common tangent, producing the case of externally tangent circles.→86.132.161.214 (talk) 19:50, 30 August 2010 (UTC)[reply]

August 30

Graph question

What is the function for a graph that starts at x=a, y=k, which begins to rise slowly then much more steeply until nearly perpendicular to the x axis, then cuts off at x', and resumes in a mirror image at x'', descending until it reaches y=k again? (Apologies for using non-standard notation; I'm afraid I don't know any better.) (The shape I'm envisioning is sort of like the elevation of a cooling tower at a power plant.) I suppose it's a hyperbola, but I looked at that article and it's too much information for me to wade through.--82.113.121.52 (talk) 18:39, 30 August 2010 (UTC)[reply]

Just cutting off $y=x^{-2}$ at suitable $\pm a,\pm b$ (that is, defining it only when $|x|\in [a,b]$) would do something like that: choose a and b to get the right amount of "slowly/nearly vertical" and the right amount of separation between the two branches. --Tardis (talk) 21:43, 30 August 2010 (UTC)[reply]

Taking $a={\frac {1}{2}},b=2$ seems to be a nice choice. Cooling tower says that they are hyperboloid, so the cross section is a hyperbola: in this case, it's $y=-{\sqrt {x^{2}-1}}$, which becomes completely vertical at $x=\pm 1$. Replace x and y with $a_{x}x-b_{x}$ and $a_{y}y-b_{y}$ to scale/shift to taste. --Tardis (talk) 21:56, 30 August 2010 (UTC)[reply]

Dear Dr. Who, thank you! Reason I was asking was, I read an article about a speculative theory that attempts to explain why gravity is so much weaker a force than, say, electromagnetism. According to the theory, if our universe is envisioned as a two-dimensional sheet lying flat then other universes are stacked on both sides like pancakes and gravity dissipates by becoming spread out among universes. I was trying to picture in my mind how universes could interact gravity-wise without them slamming into each other and hit on the idea of a graph that rises from nothingness asymptotically and, after an interval, reverses back to nothing. The graph is made by one function so the two halves are in fact a single entity but there is a break in between them. Hmm. On second thought, maybe part 1 and part 2 should switch places. A few more, if I may: (1) Why are so few of us mathematically talented? (2) I am still heartbroken about Rose Tyler, she should still be your companion! (3) Do you ever tire of stupid questions from earthlings?--82.113.106.31 (talk) 01:34, 31 August 2010 (UTC)[reply]

I'm not he, of course, but the psychic paper tells me that he read this message in 2045:

(1) That's what's so brilliant! You can't become a talented ballet dancer or a talented werewolf, but anyone can learn mathematics because it's all written down!
(2) Aren't we all, but think of the other you over there to whose world she was added.
(3) Apparently not: I keep coming back for more.
(4) ...Four things: just "the Doctor."

--Tardis (talk) 02:44, 1 September 2010 (UTC)[reply]

Magic Numbers

Can someone explain why there are "magic numbers", and particularly why they work.

I was thinking of the number 142857. Multiplying it by 7, or by multiples of 7, produces unusual results, as indeed does multiplying it by any non-multiple of seven.

When multiplied by 55 and then 7, look at the result! Why? MacOfJesus (talk) 19:21, 30 August 2010 (UTC)[reply]

There's an article on this at Cyclic number. 85.226.205.150 (talk) 20:02, 30 August 2010 (UTC)[reply]

1/7 has the decimal value 0.142857142857142..., so 142857 is just below 1000000/7. Multiplying by 7 gives a whole number just below 1000000. Similarly, 1/13 is 0.076923076..., so 76923 shows the same result when multiplied by 13.→86.132.161.214 (talk) 20:13, 30 August 2010 (UTC)[reply]

Thank you. MacOfJesus (talk) 21:11, 30 August 2010 (UTC)[reply]

The Cyclic number article mentions a number of restrictions:

This restriction also excludes such trivial cases as:

repeated digits, i.e.: 555
repeated cyclic numbers, i.e.: 142857142857
single digits preceded by zeros, i.e.: "005"

I understand the first two, but how is #3 problematic? 005 * 2 = 010, so it doesn't seem to be cyclic, trivially or not. -- ToE^T 00:14, 31 August 2010 (UTC)[reply]

Is it because "Zero" "0" is used by us so differently? "Zero" can mean "Zilch" or that there is no value in that space, such as the difference between "ten" and "one", a zero!? MacOfJesus (talk) 11:37, 31 August 2010 (UTC)[reply]

I don't buy that. The article explicitly addresses the way leading zeros are handled. -- ToE^T 13:05, 31 August 2010 (UTC)[reply]

Anything like "00005" is trivially cyclic because the cyclic permutations are "50000", "05000", "00500", "00050", and "00005". All of these are multiples of 5. (The fact that 005 * 2 = 10 doesn't mean it's non-cyclic. Cyclic means the permutations are all multiples, not that all multiples are permutations.) Staecker (talk) 13:24, 31 August 2010 (UTC)[reply]

But according to the article, cyclic numbers specifically deal with "successive multiples". -- ToE^T 13:30, 31 August 2010 (UTC)[reply]

FWIW, the three restrictions were in place when the article burst forth fully formed from the prolific keyboard of 198.99.123.63. -- ToE^T 13:38, 31 August 2010 (UTC)[reply]

Ah OK- I missed the bit about "successive". Reading more carefully now... In the bit you quoted above, "This restriction" refers to the "successive multiples" condition. So those three categories are examples of numbers which would trivially be cyclic were it not for the "successive multiples" condition. So "0005" trivially satisfies all properties of cyclic numbers except for the "successive multiples" condition, as you said. I'll try to clarify this a bit in the article. Staecker (talk) 11:57, 1 September 2010 (UTC)[reply]

I too noted the "This restriction" part, and was going to address that after figuring out #3. I disagree with your interpretation and your edit. The "consecutive multiples" condition does not lead to restrictions #1 and #2 -- they still need to be explicitly excluded as trivial cases. I suspect that you were close to the mark when you mentioned that #3 would need to be excluded as a trivial case of numbers whose permutations are (non-consecutive) multiples. Perhaps one of the sources used that definition of cyclic number. Anyone here have a copy of TPDoCaIN to see if they define it differently than the Wolfram page does? -- ToE^T 00:54, 2 September 2010 (UTC)[reply]

Well, repeated digits (#1) do give multiples when you make the permutations: "555" has permutations "555" and "555", each of which is 555 * 1; so, were it not for the "successive" condition, this would be cyclic. For (#2), the example "142857142857" does indeed give multiples when you make permutations, but they are not successive (e.g. 428571428571 = 142857142857 * 4). So this one too would be cyclic were it not for the "successive" multiples. Staecker (talk) 11:22, 3 September 2010 (UTC)[reply]

Staecker meant to type 142857142857 * 3 in the example above. -- ToE^T 16:36, 3 September 2010 (UTC)[reply]

With #2:

142857142857 × 1 = 142857142857

142857142857 × 2 = 285714285714

142857142857 × 3 = 428571428571

142857142857 × 4 = 571428571428

142857142857 × 5 = 714285714285

142857142857 × 6 = 857142857142

which are successive multiples (and are all six distinct cyclic permutations that exist for that twelve digit number). With #1, 555 only has one distinct cyclic permutation and the single multiple 1 is trivially successive. I'm not trying to pick nits here, but this is why such trivial cases are explicitly excluded. -- ToE^T 16:36, 3 September 2010 (UTC)[reply]

I've made the appropriate change to the article, explained (and with a link back here) at Talk:Cyclic number#Trivial cases. -- ToE^T 23:41, 5 September 2010 (UTC)[reply]
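The distinction discussed above can be checked mechanically. The sketch below tests only the "successive multiples are exactly the cyclic permutations" property, treating numbers as digit strings so that leading zeros are kept; the function names are invented for illustration and this is not the article's formal definition.

```python
def rotations(digits):
    """All distinct cyclic permutations of a digit string."""
    return {digits[i:] + digits[:i] for i in range(len(digits))}

def successive_multiples_are_rotations(digits):
    """True if the multiples 1, 2, ..., r of the number (zero-padded to the
    original length) are exactly its r distinct cyclic permutations."""
    value = int(digits)
    rots = rotations(digits)
    multiples = {str(value * m).zfill(len(digits)) for m in range(1, len(rots) + 1)}
    return multiples == rots

for candidate in ["142857", "555", "142857142857", "005"]:
    print(candidate, successive_multiples_are_rotations(candidate))
```

Under this test "555" and "142857142857" pass, which is why they have to be excluded explicitly as trivial cases, while "005" fails because 005 * 2 = 010 is not one of its rotations.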

Thanks to all. MacOfJesus (talk) 17:11, 1 September 2010 (UTC)[reply]

August 31

conditional probabilities

Given $P(A)=0.5$
and $P(A|B)=0.1$
and $P(A|C)=0.9$
is it possible to find $P(A|B,C)$ ?115.178.29.142 (talk) 01:20, 31 August 2010 (UTC)[reply]

No. Here is one extreme case consistent with the information given above:

The sample space consists of five points: v, w, x, y, z, with respective probabilities 1/20, 1/20, 8/20, 1/20, 9/20, and the events are

A = {w, z}

B = {w, x, y}

C = {y, z}

Then P(A) = 1/20 + 9/20 = 0.5,

P(A | B) = (1/20)/(1/20 + 8/20 + 1/20) = 0.1

P(A | C) = (9/20)/(9/20 + 1/20) = 0.9

and P(A | B & C) = 0.

Here is the opposite extreme case, also consistent with the information given above:

The sample space consists of five points: v, w, x, y, z, with respective probabilities 1/20, 1/20, 9/20, 8/20, 1/20, and the events are

A = {v, w, y}

B = {w, x}

C = {w, y, z}

Then P(A) = 1/20 + 1/20 + 8/20 = 0.5

P(A | B) = (1/20)/(1/20 + 9/20) = 0.1

P(A | C) = (1/20 + 8/20)/(1/20 + 8/20 + 1/20) = 0.9

and P(A | B & C) = 1.

Michael Hardy (talk) 02:29, 31 August 2010 (UTC)[reply]

I can't really add anything to Michael Hardy's excellent example, but I'll just say that, intuitively, your information tells you about the relationship between A & B, and also between A & C, but doesn't tell you the connection between B & C. (In terms of Venn diagrams think of overlap.) Dbfirs 19:29, 31 August 2010 (UTC)[reply]
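Both extreme cases can be verified with exact arithmetic. This is a small check using the sample points and probabilities from the two examples above; the helper names (`prob`, `summary`, `masses`) are made up for the sketch.

```python
from fractions import Fraction

def prob(event, p):
    """Probability of an event (a set of sample points) under point masses p."""
    return sum(p[w] for w in event)

def summary(p, A, B, C):
    """Return (P(A), P(A|B), P(A|C), P(A|B and C))."""
    return (
        prob(A, p),
        prob(A & B, p) / prob(B, p),
        prob(A & C, p) / prob(C, p),
        prob(A & B & C, p) / prob(B & C, p),
    )

def masses(*twentieths):
    """Point masses for v, w, x, y, z given as numerators over 20."""
    return {pt: Fraction(k, 20) for pt, k in zip("vwxyz", twentieths)}

# First extreme case: P(A | B & C) = 0
case1 = summary(masses(1, 1, 8, 1, 9), {"w", "z"}, {"w", "x", "y"}, {"y", "z"})
# Opposite extreme case: P(A | B & C) = 1
case2 = summary(masses(1, 1, 9, 8, 1), {"v", "w", "y"}, {"w", "x"}, {"w", "y", "z"})
print(case1)
print(case2)
```

The first three entries agree in both cases (0.5, 0.1, 0.9) while the fourth differs, which is the point of the counterexample.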

Which experiment is the best to perform?

You have a hypothesis that you wish to test. Based on your current knowledge, you evaluate the probability that the hypothesis is true as 50%. You can choose to perform one of three experiments. You have already previously evaluated the expected probability that the hypothesis is true after performing the experiment. For experiment 1, this is 30%. For experiment 2, 60%, and experiment 3, 90%. Which experiment is expected to best perform in getting the most accurate reading for the truth/falsity of the hypothesis?--Alphador (talk) 07:13, 31 August 2010 (UTC)[reply]

I'm not sure this question makes sense. Shouldn't the a priori expected probability that the hypothesis is true after the experiment also be 50%? An experiment will have multiple outcomes, and we could evaluate the probability that the hypothesis is true given each of the outcomes, and also evaluate the probabilities of each outcome based on our current knowledge. But if the weighted average of these probabilities is other than 50%, wouldn't that give us a different probability for the hypothesis based on our current knowledge? —Preceding unsigned comment added by 203.97.79.114 (talk) 09:45, 31 August 2010 (UTC)[reply]

This doesn't make sense to me either. It is contradictory to say that according to your current knowledge the probability is 50%, yet you also know in advance that performing an experiment will change this to some other value. 86.173.36.196 (talk) 14:10, 31 August 2010 (UTC)[reply]

Perhaps our article on Hypothesis testing might clarify your thinking. You might also be interested in Power of a test and also Type I error and Type II error. Dbfirs 19:14, 31 August 2010 (UTC)[reply]

I thought the idea was to figure out the information gain from each experiment, and pick the highest one. 67.122.209.135 (talk) 20:59, 1 September 2010 (UTC)[reply]

That would be possible if we were given sufficient data. The data given by the OP is not only insufficient, but also inconsistent. -- Meni Rosenfeld (talk) 09:21, 2 September 2010 (UTC)[reply]

The law of total probability says:

The prior expected value of the posterior probability is equal to the prior probability.

Michael Hardy (talk) 23:04, 1 September 2010 (UTC)[reply]
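Written out for a hypothesis $H$ and an experiment with outcomes $e$ (symbols chosen here for illustration), the statement above is the following one-line computation:

```latex
\mathbb{E}\bigl[P(H \mid E)\bigr]
  = \sum_{e} P(E = e)\,P(H \mid E = e)
  = \sum_{e} P(H,\ E = e)
  = P(H).
```

So with a prior of 50%, the expected posterior of every experiment must also be 50%; expected values of 30%, 60% or 90% cannot all come from the same state of knowledge.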

May I ask a silly question regarding the calendar and how we work out rates to pay?

I find myself trying to explain this subject to well-educated people who are perhaps not mathematicians or accountants. I have been trying to explain that a rate of rent per calendar month is a payment of twelve per year and so is to 12 * 4 = 48 weeks. A rate per year is to 52 weeks. Yet a company is trying to fit one into software that does not fit. Am I wrong? MacOfJesus (talk) 11:13, 31 August 2010 (UTC)[reply]

If you pay monthly then there are twelve payments a year, but this doesn't mean there are 48 weeks in a year. If you pay yearly then there is one payment a year, regardless of whether there are exactly 52 weeks in a year (which there aren't). 86.173.36.196 (talk) 14:06, 31 August 2010 (UTC).[reply]

Thank you. I was being over-simplistic, for now I have (The un-enviable task) of telling Bosses this! MacOfJesus (talk) 17:00, 31 August 2010 (UTC)[reply]

When I ran a weekly and monthly payroll, the yearly pay was divided by 52.142857 to obtain the weekly pay. This meant that those on weekly pay were actually paid for an extra day every leap year, and thus earned slightly more than those who elected for monthly pay. Dbfirs 19:04, 31 August 2010 (UTC)[reply]

But was that 12 monthly payments per year or 13? Or did it just roll over from one year to the next? And would it not be better to divide the yearly sum by 365, or 365.25, and then multiply it by the appropriate number of days for the weekly packet? MacOfJesus (talk) 21:18, 31 August 2010 (UTC)[reply]

Monthly payments (in the UK at least) are paid on or before a fixed date of the month, so there are 12 in a year. Weekly payments are always exactly seven days (usually paid on Friday), so there is no advantage in dividing by 365.25 because it gives exactly the same result. Dbfirs 07:51, 1 September 2010 (UTC)[reply]

But people don't work 7 days a week, usually. Mostly 5 or 6. And, bosses are miserable with wage-paying! MacOfJesus (talk) 08:09, 1 September 2010 (UTC)[reply]

One solution to the whole problem is just to agree on an hourly rate, then record the number of hours worked in any convenient period. The problem with this method is that it usually cheats employees out of paid bank holidays, paid holiday, and paid sick-leave (though the first two of these can be incorporated in the hourly rate by an enlightened employer, and sick-leave can be paid at an "average rate"). Dbfirs 08:45, 3 September 2010 (UTC)[reply]

Each calendar month has on average 365.25/12 = 30.4375 days in it, not the 28 days you implied above. And on average there are 365.25/7 = 52.17857143 weeks in the year. Each year consists of 52 weeks plus one day plus any leap day. 92.15.3.64 (talk) 16:47, 2 September 2010 (UTC)[reply]
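The averages quoted above are easy to reproduce. The sketch below assumes an average year of 365.25 days, as in the comment, and uses a hypothetical helper `weekly_from_monthly` to convert a monthly rent into an equivalent weekly figure.

```python
DAYS_PER_YEAR = 365.25  # average Julian year, as used in the comment above

days_per_month = DAYS_PER_YEAR / 12         # average length of a calendar month
weeks_per_year = DAYS_PER_YEAR / 7          # average number of weeks in a year
weeks_per_month = DAYS_PER_YEAR / (7 * 12)  # average number of weeks in a month

def weekly_from_monthly(monthly_rent):
    """Weekly amount that yields the same yearly total as 12 monthly payments."""
    return monthly_rent * 12 / weeks_per_year

print(days_per_month, weeks_per_year, weeks_per_month)
print(weekly_from_monthly(1000.0))
```

Multiplying a monthly rate by 12 and dividing by about 52.18, rather than treating a month as 4 weeks, keeps the yearly totals equal.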

You simply need to explain that things don't fit quite right, therefore we have to approximate.

You have estimates and exact values. There are about 365.2422 days in a year, which we arbitrarily divide into 12 months, giving approximately 30 days to a month, the more precise value being 365.2422/12. So a month contains approximately 4.35 weeks, that is, 365.2422/(7 × 12). A real-world example of approximation is tax rates: if your tax rate is 10%, what is the tax on USD 1.01, and where does the 10% of USD 0.01 go? To answer your question, you are wrong. Your figure for weeks per month should be about 4.35, not 4. You need to increase your accuracy to get balances to the nearest cent, which means at least a few more digits of accuracy. There are approximately 52.17746 weeks in a year, or approximately 4.348 weeks in a month, and exactly 7 days in a week. Let me rephrase your question, if you please: "I have been trying to explain that a rate of rent per calendar month (the rate that gives 12 payments over 365.2422 days, or one tropical year) is a payment of twelve per year (exactly, by definition) and so is to 12 (exactly) * 4 (really about 4.348) = 48 (really about 52.18) weeks. A rate per year is to 52 (approximately) weeks (more precisely, about 52.17746)."

Blame it on the earth, not revolving exactly 360 days. It will someday as it slows, but until then we have to approximate. ( and btw, that extra 5.2422 days is not silly to a USD$120,000 month rent ) —Preceding unsigned comment added by 69.232.209.57 (talk) 20:21, 2 September 2010 (UTC)[reply]

Random permutation

Initially:

a(1) = 1
a(2) = 2
...
a(n) = n

Then:

for i = 1 to n
r = random number between 1 and n (inclusive)
swap the values of a(i) and a(r)
next

Question: will this produce a uniformly random permutation of 1,2..n in a() (i.e. all permutations equally likely)? If not, is there a simple tweak that will fix it so it does? —Preceding unsigned comment added by 86.173.36.196 (talk) 14:01, 31 August 2010 (UTC)[reply]

No, it won't: you have $n^{n}$ possibilities for your set of random numbers and $n!$ for your output orders. $\forall n>2\;n!\!\not |\,n^{n}$ so it can't be uniform. See Knuth shuffle. --Tardis (talk) 14:20, 31 August 2010 (UTC)[reply]

Knuth shuffle is unbiased. If the random number generator is also unbiased, why won't the result be uniform? -- kainaw ™ 14:32, 31 August 2010 (UTC)[reply]

Because the OP's algorithm does not do Knuth shuffle. Knuth shuffle is the "simple tweak that will fix it".—Emil J. 14:39, 31 August 2010 (UTC)[reply]

Wow. After so many years of grading programming homework, I feel ashamed that I overlooked that. -- kainaw ™ 02:10, 1 September 2010 (UTC)[reply]

The OP's biased algorithm is discussed in the second paragraph of Fisher–Yates shuffle#Implementation errors. -- ToE^T 15:04, 31 August 2010 (UTC)[reply]
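The counting argument can be made concrete by enumerating every possible sequence of random choices instead of sampling. The sketch below uses 0-based indices and invented function names; `naive_outcomes` follows the OP's loop (swap position i with any position), while `fisher_yates_outcomes` follows the usual Fisher–Yates/Knuth variant (swap position i only with a position from 0 to i).

```python
from collections import Counter
from itertools import product

def naive_outcomes(n):
    """Count how often each permutation arises over all n**n choice sequences."""
    counts = Counter()
    for choices in product(range(n), repeat=n):
        a = list(range(1, n + 1))
        for i, r in enumerate(choices):
            a[i], a[r] = a[r], a[i]   # r ranges over every position
        counts[tuple(a)] += 1
    return counts

def fisher_yates_outcomes(n):
    """Count how often each permutation arises over all n! choice sequences."""
    counts = Counter()
    positions = range(n - 1, 0, -1)             # i = n-1, ..., 1
    ranges = [range(i + 1) for i in positions]  # j ranges over 0..i only
    for choices in product(*ranges):
        a = list(range(1, n + 1))
        for i, j in zip(positions, choices):
            a[i], a[j] = a[j], a[i]
        counts[tuple(a)] += 1
    return counts

print(sorted(naive_outcomes(3).items()))
print(sorted(fisher_yates_outcomes(3).items()))
```

For n = 3 the naive loop spreads 27 equally likely choice sequences over 6 permutations, so some permutations must occur more often than others, whereas the restricted swap produces each permutation exactly once.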

Elementary number theory problem

Resolved

Let $n\geq 2$ and k be any positive integers. Prove that $(n-1)^{2}|(n^{k}-1)$ if and only if $(n-1)|k$ . The hint attached to the problem in the book says, let $n^{k}=((n-1)+1)^{k}$ . I started with the hint, and saw that (n-1) always divides $n^{k}-1$. Now I can't think of anything else to do. I feel the hint was supposed to send me in another direction. Can someone help please? Thanks -Shahab (talk) 15:35, 31 August 2010 (UTC)[reply]

You can continue in the same direction: compute an explicit expression for m = (n^k − 1)/(n − 1) (hint: think about these as polynomials in variable n), and determine when n − 1 | m (hint: if p(x) is an integer polynomial and n and c are integers, then p(n) ≡ p(c) mod (n − c)).—Emil J. 15:51, 31 August 2010 (UTC)[reply]

(edit conflict) Use the binomial theorem to expand $(1+(n-1))^{k}$. Subtract 1 and divide by $n-1$ and you are left with a sum of multiples of powers of $n-1$ and a constant term. All the multiples of powers of $n-1$ are obviously divisible by $n-1$. What is the value of the constant term ? Gandalf61 (talk) 15:55, 31 August 2010 (UTC)[reply]

@Emil: I have used your hint as follows: $(n^{k}-1)=(n-1)(n^{k-1}+\cdots +1)$ and so taking $p(x)=x^{k-1}+\cdots +1$, $p(n)=m$, and c=1 in your hint we have $(n-1)|(m-k)$. Now suppose $(n-1)|k$. Clearly then $(n-1)|m$ and so $(n-1)^{2}|m(n-1)$ as required. Supposing $(n-1)^{2}|m(n-1)$ gives us $(n-1)|m$ and combined with $(n-1)|(m-k)$ this yields $(n-1)|k$. I hope this is all correct.

@Gandalf61: The constant term is k, but I dont understand your approach. Could you be more explicit. Thanks-Shahab (talk) 16:21, 31 August 2010 (UTC)[reply]

Gandalf's suggestion is actually much simpler than mine. Just expand n^k − 1 = (1 + (n − 1))^k − 1 in terms of powers of n − 1 using the binomial theorem. The ith powers for i ≥ 2 are always divisible by (n − 1)², and the 0th power cancels out, which only leaves us with the term involving the 1st power, which you computed to be k(n − 1). Then it should be obvious when is this divisible by (n − 1)².—Emil J. 16:37, 31 August 2010 (UTC)[reply]
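Spelled out, the binomial expansion described above reads as follows (a worked version of the same argument, with nothing assumed beyond $n\geq 2$ and $k\geq 1$):

```latex
n^{k}-1 = \bigl(1+(n-1)\bigr)^{k}-1
        = \sum_{i=0}^{k}\binom{k}{i}(n-1)^{i}-1
        = k\,(n-1)+\sum_{i=2}^{k}\binom{k}{i}(n-1)^{i}.
```

Every term of the last sum is divisible by $(n-1)^{2}$, so $(n-1)^{2}\mid n^{k}-1$ exactly when $(n-1)^{2}\mid k(n-1)$, that is, when $(n-1)\mid k$.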

Oh yes. Why do many problems appear extremely simple once the solution is presented! Thanks for all the help-Shahab (talk) 16:54, 31 August 2010 (UTC)[reply]

September 1

Rotation matrix: counterclockwise vs. clockwise

Resolved

Hello, I am trying to teach myself algebraic matrices. I seem to understand them, and am getting right answers, except in one particular case. For the life of me, I can't understand why the matrix for a 90-degree counterclockwise rotation is what it is.

We are told (for example at Rotation matrix) that the 90-degree counterclockwise rotation matrix is

$R(90^{\circ })={\begin{bmatrix}0&-1\\[3pt]1&0\\\end{bmatrix}}$

but to me that looks like a clockwise rotation. Consider the illustration at Matrix_(mathematics)#Interpretation_as_a_parallelogram, which represents the transform of the unit square by the matrix

$A={\begin{bmatrix}a&b\\c&d\end{bmatrix}}$

If we plug in the numbers from the above counterclockwise rotation matrix, then the (a,b) vertex — originating at (1,0) as a corner of the unit square — is transformed to (0, –1). No? Meanwhile the (c,d) vertex, originating at (0,1), is transformed to (1,0). To me that produces a clockwise rotation: The unit square has been rotated such that it is now below the x-axis but still to the right of the y-axis. What gives?

I get the same result when I multiply the row vectors of the unit square [0 0], [1 0], [1 1], [0 1] (also from Matrix_(mathematics)#Interpretation_as_a_parallelogram) by the 90-degree counterclockwise rotation matrix above. This seems to produce the vertices [0 0], [0 –1], [1 –1], [1 0] — which again describes a clockwise rotation below the x-axis.

What am I misunderstanding about this interpretation of the rotation matrix? It is particularly vexing since I've had success with every basic type of transform, except for rotations greater or less than 180 degrees. Help is greatly appreciated. -Jordgette (talk) 00:16, 1 September 2010 (UTC)[reply]

My guess is that the mistake is in your matrix multiplication. Using your matrix A, $A{\begin{bmatrix}1\\0\end{bmatrix}}={\begin{bmatrix}a\\c\end{bmatrix}}$. Perhaps you were calculating ${\begin{bmatrix}1&0\end{bmatrix}}A={\begin{bmatrix}a&b\end{bmatrix}}$? The convention is to use column vectors and (therefore) multiply with the matrix on the left and the vector on the right. Given A and v, $\mathbf {v} ^{\mathrm {T} }A=(A^{\mathrm {T} }\mathbf {v} )^{\mathrm {T} }$, so reversing the multiplication (and transposing v to make the multiplication work then) is equivalent to transposing A, which in your case makes it equal to -A, so of course it generates the negative of the vector that you want (which is equivalent to a further rotation by 180° and changes 90° to -90°). --Tardis (talk) 02:13, 1 September 2010 (UTC)[reply]

I was multiplying a row vector on the left and the matrix on the right. That would be my mistake. Thank you for taking the time. -Jordgette (talk) 07:04, 1 September 2010 (UTC)[reply]
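The two conventions can be compared directly. This is a minimal sketch with hand-written 2×2 helpers (the names `mat_vec` and `vec_mat` are made up here), applied to the 90-degree matrix quoted in the question.

```python
R90 = [[0, -1],
       [1,  0]]

def mat_vec(m, v):
    """Matrix times column vector: the usual convention."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]

def vec_mat(v, m):
    """Row vector times matrix: equivalent to using the transpose of m."""
    return [v[0] * m[0][0] + v[1] * m[1][0],
            v[0] * m[0][1] + v[1] * m[1][1]]

square = [[0, 0], [1, 0], [1, 1], [0, 1]]
print([mat_vec(R90, v) for v in square])  # counterclockwise image of the unit square
print([vec_mat(v, R90) for v in square])  # clockwise image, as found in the question
```

With column vectors (1, 0) goes to (0, 1), a counterclockwise quarter turn; with row vectors it goes to (0, -1), which reproduces the vertices listed in the question.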

Convert procedure to formula?

How do I convert something like: if(x>50) y=(30+x), else y=(x/2), to an equation of the form y = f(x)?

Is this a standard 'area' of mathematics, if so what is it called, and where can I go online to learn it?

I have also seen procedures that involve for/while loops that I know should be representable as a single mathematical function and shouldn't need iteration. But I have no clue how to build a function from the procedure. Can anyone point me in the right direction? Thanks.--Dacium (talk) 03:27, 1 September 2010 (UTC)[reply]

To answer the first part of your question, it looks like what you want is a Piecewise function... the value of x (i.e., where it is in the domain) determines which piece of the function you're using to evaluate the dependent variable:
$f(x)={\begin{cases}x/2,&{\mbox{if }}x\leq 50\\30+x,&{\mbox{if }}x>50\end{cases}}$ .
Essentially it's saying the same thing that you've expressed as an "if-else" or another programming function that involves selecting cases based on the input. --Kinu ^t/_c 03:50, 1 September 2010 (UTC)[reply]

Selecting cases is avoided by using Iverson brackets: y=[x>50]·(30+x)+[x≤50]·(x/2). For-loops are avoided by using J. Bo Jacoby (talk) 05:58, 1 September 2010 (UTC).[reply]
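The piecewise definition and the Iverson-bracket formula describe the same function. The sketch below writes both in Python, relying on the fact that Python booleans behave as 0 and 1 in arithmetic; the function names are illustrative.

```python
def f_piecewise(x):
    """The original if/else procedure."""
    if x > 50:
        return 30 + x
    return x / 2

def f_iverson(x):
    """Single-expression form: [x>50]*(30+x) + [x<=50]*(x/2)."""
    return (x > 50) * (30 + x) + (x <= 50) * (x / 2)

print(f_piecewise(50), f_iverson(50))
print(f_piecewise(51), f_iverson(51))
```

Exactly one of the two brackets is 1 for any x, so the sum selects the matching branch.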

You could also use Recursion to avoid for loops.

For some for loops it may be possible to replace the loop by a simple formula without recursion. For instance sum =0; for i=1 to n: sum=sum+i; is an arithmetic progression and could be replaced by the simple function ${\frac {n(n-1)}{2}}$. Similar results could be obtained for other loops but I'm not aware of a general procedure, the article Series (mathematics) might be of help.--Salix (talk): 08:45, 1 September 2010 (UTC)[reply]

Corresponding J expressions may be written

   n=.10
   +/1+i.n NB. brute force summation without for loop
55
   n*(n+1)%2 NB. computing the sum
55

Salix alba meant to write $\scriptstyle {\frac {n(n+1)}{2}}$ . Bo Jacoby (talk) 10:08, 1 September 2010 (UTC)[reply]
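The same comparison as the J session, sketched in Python for readers who do not use J: a loop-based sum next to the corrected closed form n(n+1)/2. The function names are made up for the example.

```python
def sum_loop(n):
    """Sum 1 + 2 + ... + n with an explicit loop."""
    total = 0
    for i in range(1, n + 1):
        total += i
    return total

def sum_closed(n):
    """Closed form for the same arithmetic progression."""
    return n * (n + 1) // 2

print(sum_loop(10), sum_closed(10))
```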

Another number theory problem

Resolved

I have two problems:

1. If p is an odd prime, show that every prime divisor of $2^{p}-1$ must be of the form 2pk+1 for some natural number k.
2. If $f_{n}$ are the Fibonacci numbers then show that Euclid's algorithm takes n steps to determine $\gcd(f_{n+2},f_{n+1})$.

I have been racking my brain on this one (the first one I have no idea about), using induction etc. but to no avail. -Shahab (talk) 13:05, 1 September 2010 (UTC)[reply]

The second one's pretty trivial: the Euclidean algorithm constructs a sequence of numbers, whose first two elements are $f_{n+2}$ and $f_{n+1}$. What is the third element?--Emil J. 13:11, 1 September 2010 (UTC)[reply]
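The claim about the step count can be checked numerically. The sketch assumes the indexing $f_{1}=f_{2}=1$ and counts one step per division with remainder; both conventions are assumptions, since the thread does not state them.

```python
def fib(k):
    """Fibonacci number f_k with f_1 = f_2 = 1 (assumed indexing)."""
    a, b = 1, 1
    for _ in range(k - 1):
        a, b = b, a + b
    return a

def euclid_steps(a, b):
    """Number of division steps Euclid's algorithm takes on (a, b)."""
    steps = 0
    while b != 0:
        a, b = b, a % b
        steps += 1
    return steps

for n in range(1, 8):
    print(n, euclid_steps(fib(n + 2), fib(n + 1)))
```

Each step replaces a pair of consecutive Fibonacci numbers by the previous pair, which is the pattern the hint points at.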

As for the first one: you want to show that $2^{p}-1\equiv 1{\pmod {2p}}$, or in other words, $2^{p}\equiv 2{\pmod {2p}}$. You can work mod 2 and mod p separately, as p is odd. The rest is Fermat's little theorem.--Emil J. 13:19, 1 September 2010 (UTC)[reply]

Thanks Emil for part 2. But for the 1st part I don't understand how you concluded that I wished to show $2^{p}-1\equiv 1{\pmod {2p}}$ or equivalently $2^{p}-1=2kp+1$ for some k. Could you be a little more explicit? Thanks-Shahab (talk) 14:05, 1 September 2010 (UTC)[reply]

Sorry, I misread the question. So, we have a prime divisor q | 2^p − 1, and we want to show that $q\equiv 1{\pmod {2p}}$. It's clear mod 2, so we only need it mod p. Now, consider the multiplicative order k of 2 modulo q. The assumption gives k | p, and Fermat's little theorem gives k | q − 1. Conclude p | q − 1.--Emil J. 14:16, 1 September 2010 (UTC)[reply]
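The statement is easy to test for small odd primes by factoring $2^{p}-1$ with trial division. The sketch below is only a numerical check, not a proof, and the helper name `prime_factors` is invented for the example.

```python
def prime_factors(m):
    """Set of prime factors of m, found by simple trial division."""
    factors = set()
    d = 2
    while d * d <= m:
        while m % d == 0:
            factors.add(d)
            m //= d
        d += 1
    if m > 1:
        factors.add(m)
    return factors

for p in [3, 5, 7, 11, 13, 17, 19, 23, 29]:
    divisors = sorted(prime_factors(2**p - 1))
    # every prime divisor q should satisfy q = 2pk + 1, i.e. q % (2p) == 1
    print(p, divisors, [q % (2 * p) for q in divisors])
```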

Thanks. I have another problem, (sorry for posting so many, I am appearing for an exam after a long long time). If gcd(a,b,c)lcm[a,b,c]=abc then show that (a,b)=(b,c)=(c,a)=1.-Shahab (talk) 21:33, 1 September 2010 (UTC)[reply]

Hi Shahab. For any prime $p$, let the nonnegative integers $\alpha$, $\beta$, $\gamma$ be the exponents of the greatest powers of $p$ that divide $a$, $b$ and $c$ respectively. Then $\min(\alpha ,\beta ,\gamma )+\max(\alpha ,\beta ,\gamma )=\alpha +\beta +\gamma$, and from this you have to deduce that at most one out of $\alpha ,\beta ,\gamma$ is positive (meaning that $p$ divides at most one out of $a,b,c$). You may think w.l.o.g. that $\scriptstyle 0\leq \alpha \leq \beta \leq \gamma$, of course.--pm a 00:38, 2 September 2010 (UTC)
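The claim can also be tested by brute force over small triples. This is only a sketch: the search limit is arbitrary and the helper names `lcm2` and `counterexamples` are hypothetical.

```python
from functools import reduce
from itertools import product
from math import gcd


def lcm2(x, y):
    """Least common multiple of two positive integers."""
    return x * y // gcd(x, y)


def counterexamples(limit):
    """Triples up to limit with gcd*lcm == abc that are NOT pairwise coprime."""
    bad = []
    for a, b, c in product(range(1, limit + 1), repeat=3):
        g = reduce(gcd, (a, b, c))
        m = reduce(lcm2, (a, b, c))
        if g * m == a * b * c:
            if not (gcd(a, b) == gcd(b, c) == gcd(c, a) == 1):
                bad.append((a, b, c))
    return bad


found = counterexamples(25)  # expected to be empty if the claim holds
```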

Thanks pma. I hope you are doing well.

Name of a curve

Say A is an arbitrary smooth curve, and P is some point on that curve. I draw a line normal to A through P, and I mark off point Q on the line such that PQ is some constant distance. I then trace the locus of Q as P moves along A, thus forming a new curve B. Is there a name for the relationship between curves A and B? I vaguely had in mind "evolute", but looking that up I see it means something completely different.—Preceding unsigned comment added by 86.135.28.150 (talk) 13:59, 1 September 2010 (UTC)[reply]

B is more or less a parallel curve of A.—Emil J. 14:07, 1 September 2010 (UTC)[reply]

Yes, it is indeed a parallel curve. I have added this ref (the first) to Parallel curve. DVdm (talk) 14:21, 1 September 2010 (UTC)[reply]

Thanks (duh). 86.184.27.6 (talk) 17:14, 1 September 2010 (UTC)[reply]

September 2

find scalar values a, b and c

Given $\mathbf {A} ={\begin{bmatrix}2&-5\\3&1\\\end{bmatrix}}$ , find scalar values a, b, c (NOT ALL ZERO) for which $a\mathbf {I} +b\mathbf {A} +c\mathbf {A} ^{2}=\mathbf {O}$
I get these 4 simultaneous equations in 3 unknowns:
$a+2b-11c=0$
$-5b-15c=0$
$3b+9c=0$
$a+b-14c=0$
What do I do now? Wikinv (talk) 01:41, 2 September 2010 (UTC)[reply]

Check your initial work. I believe one of the four equations is wrong. Next, take a look at Simultaneous equations for various techniques for solving these. One approach is to rearrange one of the equations to isolate one of the unknowns (i.e., $a=...$) and then substitute the result into the next equation. After the second iteration, you should have a value for one of the unknowns. Repeat the process until you've solved them all. -- Tom N (tcncv) talk/contrib 02:09, 2 September 2010 (UTC)

After attempting to solve it myself, I found that there's a gotcha in those equations. The problem has a solution, but that solution may not be unique. I assume this is homework, so I will not give you too obvious a hint. Write back if you need additional help. -- Tom N (tcncv) talk/contrib 02:33, 2 September 2010 (UTC)[reply]

Fixed the equation. I was aware that there are multiple solutions, indeed that is why I don't know how to solve it.--Wikinv (talk) 07:05, 2 September 2010 (UTC)[reply]

Although, having fixed the equation, the solution becomes quite easy by inspection, but is there an analytic way of solving it?--Wikinv (talk) 07:09, 2 September 2010 (UTC)[reply]

Just realised that the second and third equations are in fact exactly the same. Furthermore, it is evident that there is an infinite number of solutions (that's why it was so easy to solve by inspection!)--Wikinv (talk) 07:14, 2 September 2010 (UTC)[reply]

Observe that if (a,b,c) is a nonzero solution to the equation $\scriptstyle a\mathbf {I} +b\mathbf {A} +c\mathbf {A} ^{2}=\mathbf {O}$ where $\scriptstyle \mathbf {A}$ is any square matrix, and k is a nonzero number, then (ka,kb,kc) is a nonzero solution too. Bo Jacoby (talk) 05:15, 2 September 2010 (UTC).

In general, the equation $a\mathbf {I} +b\mathbf {A} +c\mathbf {A} ^{2}=\mathbf {O}$ has a one-dimensional space of solutions (a, b, c) for any 2x2 matrix $\mathbf {A}$ that is not a scalar multiple of the identity (for a scalar multiple the solution space is two-dimensional). To see this, let

$\mathbf {A} ={\begin{bmatrix}a_{11}&a_{12}\\a_{21}&a_{22}\\\end{bmatrix}}{\text{ and }}\mathbf {A'} ={\begin{bmatrix}a_{22}&-a_{12}\\-a_{21}&a_{11}\\\end{bmatrix}}$

Then $\mathbf {A'A} =\det(\mathbf {A} )\mathbf {I}$ but $\mathbf {A'} =\mathrm {tr} (\mathbf {A} )\mathbf {I} -\mathbf {A}$ so

$(\mathrm {tr} (\mathbf {A} )\mathbf {I} -\mathbf {A} )\mathbf {A} =\det(\mathbf {A} )\mathbf {I}$

$\Rightarrow \det(\mathbf {A} )\mathbf {I} -\mathrm {tr} (\mathbf {A} )\mathbf {A} +\mathbf {A} ^{2}=\mathbf {0}$

so $(a,b,c)=(\det(\mathbf {A} ),-\mathrm {tr} (\mathbf {A} ),1)$ or (as Bo says) any multiple of this. Gandalf61 (talk) 08:54, 2 September 2010 (UTC)

Alternatively, just take the coefficients of the characteristic polynomial. —Preceding unsigned comment added by 203.97.79.114 (talk) 09:11, 2 September 2010 (UTC)[reply]
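For the matrix in the question, the characteristic-polynomial coefficients can be checked directly. The following is a minimal sketch using plain nested lists; `matmul` and `combination` are names chosen for this example.

```python
def matmul(X, Y):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def combination(a, b, c, A):
    """Return a*I + b*A + c*A^2 for a 2x2 matrix A."""
    I = [[1, 0], [0, 1]]
    A2 = matmul(A, A)
    return [[a * I[i][j] + b * A[i][j] + c * A2[i][j] for j in range(2)]
            for i in range(2)]


A = [[2, -5], [3, 1]]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]  # 2*1 - (-5)*3 = 17
tr = A[0][0] + A[1][1]                       # 2 + 1 = 3
result = combination(det, -tr, 1, A)         # 17*I - 3*A + A^2
```

Here `result` is the zero matrix, so (a, b, c) = (17, −3, 1) solves the original problem, as does any multiple of it.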

See Gaussian elimination. Properly used, it works also for systems of linear equations that have no solutions, multiple solutions, and/or redundant equations. I don't know if our articles cover the details involved, but if not, any introductory linear algebra book will. -- Meni Rosenfeld (talk) 09:17, 2 September 2010 (UTC)
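As an illustration of how elimination copes with the redundant equations here, the sketch below row-reduces the coefficient matrix of the four equations in (a, b, c) using exact fractions. The function `rref` is a bare-bones implementation written for this example, not a library routine.

```python
from fractions import Fraction


def rref(rows):
    """Reduce a matrix (list of lists) to reduced row echelon form."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots = []
    r = 0
    for col in range(len(m[0])):
        # find a row at or below r with a nonzero entry in this column
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue  # no pivot: this column is a free variable
        m[r], m[pivot] = m[pivot], m[r]
        lead = m[r][col]
        m[r] = [x / lead for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                factor = m[i][col]
                m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        pivots.append(col)
        r += 1
        if r == len(m):
            break
    return m, pivots


# Coefficients of a, b, c in the four equations from the question
system = [[1, 2, -11], [0, -5, -15], [0, 3, 9], [1, 1, -14]]
reduced, pivots = rref(system)
```

The reduced form has two nonzero rows, a − 17c = 0 and b + 3c = 0, and c is free, which gives the one-parameter family (17c, −3c, c).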

Numerical analysis notation

I was reviewing my numerical analysis textbook, and stumbled across a notation I didn't remember, concerning propagation of errors. For background, the question is, assuming that $f:\mathbb {R} \rightarrow \mathbb {R}$ is a differentiable function, and that ${\overline {x}}$ is an approximation of $x\in \mathbb {R}$ , what is an upper bound on the uncertainty $|f({\overline {x}})-f(x)|$ ? Now, the mean value theorem directly gives that there is a $\xi$ between $x$ and ${\overline {x}}$ such that $|f({\overline {x}})-f(x)|=|f'(\xi )||{\overline {x}}-x|$ . We know that $|{\overline {x}}-x|\leq M$ , but we do not know an upper bound on $\textstyle |f'(\xi )|$ .

Now, what the book does here is replace the unknown $\textstyle f'(\xi )$ with the known $f'({\overline {x}})$ . The idea, presumably, is that since $|{\overline {x}}-\xi |$ is small, then so is $|f'({\overline {x}})-f'(\xi )|$ . The book tacitly makes this assumption, mind you, without requiring a single fact about $\textstyle f'$ , not even that it be continuous! Of course, it is not then necessarily true that $|f({\overline {x}})-f(x)|\leq |f'({\overline {x}})|M$ , and the book does not claim so, but replaces the $\leq$ sign with a $\lesssim$ sign, telling the reader to read it as "less than or approximately equal to". I would say it seems reasonable to interpret this as "inequality that almost certainly holds unless the function is too irregular". It feels kind of sloppy though. Is it just my book or is this common notation? Is this a case of "since we're dealing with applications, we don't have to care about badly behaved functions"? 85.226.206.114 (talk) 07:57, 2 September 2010 (UTC)

Trying to find an estimate for $\textstyle f'(\xi )$ gets you back to the original problem but with the derivative of the function. So this estimate will be in terms of $\textstyle f''$, and if you try to estimate this it will be in terms of $\textstyle f'''$, etc. But each time you're multiplying the estimate by $|{\overline {x}}-x|$, which is assumed to be small. So I think the interpretation of $\lesssim$ is "less than or equal to, up to a quantity that is small compared to the other quantities." You can construct examples where the derivatives get very large or infinite so that the error term is still large even with the $|{\overline {x}}-x|$ factors, but these aren't encountered often in practice. The $\lesssim$ notation is a bit vague perhaps, but if you want more precise notation try a different numerical analysis book; there is no shortage of them.--RDBury (talk) 15:05, 2 September 2010 (UTC)
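A concrete instance may make the "approximately" visible. The sketch below picks $f=\exp$, ${\overline {x}}=1$ and $M=0.01$ (all arbitrary choices for illustration) and compares the actual error with the first-order estimate $|f'({\overline {x}})|M$ when the true value sits at the edge of the allowed range.

```python
import math

f = math.exp   # f(x) = e^x, so f'(x) = e^x as well
x_bar = 1.0    # the approximation
M = 0.01       # known bound on |x_bar - x|
x = x_bar + M  # a true value at the edge of the allowed range

actual_error = abs(f(x_bar) - f(x))
first_order_bound = abs(f(x_bar)) * M  # |f'(x_bar)| * M

# Relative amount by which the estimate is exceeded; it is of order M.
excess = (actual_error - first_order_bound) / first_order_bound
```

In this case the strict inequality fails, but only by a relative amount of about M/2, which is the kind of discrepancy the $\lesssim$ sign is meant to absorb.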

All right, thanks! 85.226.205.5 (talk) 08:42, 3 September 2010 (UTC)[reply]

Drawing 45-45-90 triangles on spheres

Can you draw a 45-45-90 triangle on a sphere so all three sides have whole-number lengths? 20.137.18.50 (talk) 16:31, 2 September 2010 (UTC)[reply]

You cannot draw a 45–45–90 triangle on a sphere at all. The sum of angles of any triangle on a sphere is strictly more than 180°, see spherical geometry.—Emil J. 16:36, 2 September 2010 (UTC)[reply]
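The statement about angle sums can be made quantitative with Girard's theorem, quoted here from standard spherical geometry rather than from the thread: for a triangle with angles A, B, C (in radians) and area T on a sphere of radius R,

```latex
A + B + C = \pi + \frac{T}{R^{2}} > \pi .
```

Since 45° + 45° + 90° is exactly 180°, such a triangle would need zero area.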

September 3

Standard deviation

Hi all! In physics we're doing a bit of stats and I noticed in the standard deviation formula they divide by N-1 rather than just N. I asked my teacher and he said he didn't get it either, and look it up on Wikipedia or something like that, so here I am. I tried looking at your articles Standard_deviation and Bessel's correction, but that didn't really help because I don't have a university-level stats background :/ Can someone who does explain why you divide by N-1, in simpler terms? I'm OK with (and even expect you to) dumb the concept down a little --cc —Preceding unsigned comment added by 76.229.208.208 (talk) 01:58, 3 September 2010 (UTC)[reply]

As I understand it, the N-1 comes in because you are trying to estimate the actual standard deviation based on sample data. If you put N in the denominator it turns out that the estimate will, on average, be too low. So a correction factor is built into the formula so that the estimate will average to the actual value if the experiment is repeated many times. When the correction factor is added it works out the same as using N-1 in the denominator instead of N. It has been noted here before, though, that if your sample is small enough that it actually makes a difference then your sample size is too small.--RDBury (talk) 03:46, 3 September 2010 (UTC)

See the Wikipedia article on unbiased estimator, which has the explanation you're looking for. --173.49.14.153 (talk) 04:20, 3 September 2010 (UTC)[reply]

If you knew the population (actual) mean rather than estimating it, and used that to get the squared differences, then N would be correct. However, using the sample (estimated) mean makes the sum of the squared differences slightly smaller. In fact the sum of the squared differences from the population mean is equal to the sum of the squares of the differences from the sample mean plus N times the square of the difference between the population mean and the sample mean. This itself gives you an estimate of the probable difference between the population and sample mean, so the working out in the article just uses this to get an estimate of the sum of squared differences from the population mean. A finicky point is that it is only the expression without the square root that is unbiased; the estimated standard deviation from taking the square root is biased, but I would worry even less about that than about using N instead of N-1 in the denominator. Dmcq (talk) 07:57, 3 September 2010 (UTC)
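The effect is easy to see in a simulation. The sketch below draws many small samples from a normal population with known variance 4 and averages the two candidate variance estimates. The sample size, trial count and seed are arbitrary choices.

```python
import random

random.seed(0)
N = 5            # sample size
TRIALS = 20000   # number of repeated experiments
TRUE_VAR = 4.0   # population variance (standard deviation 2)

sum_div_n = 0.0
sum_div_n_minus_1 = 0.0
for _ in range(TRIALS):
    sample = [random.gauss(0.0, 2.0) for _ in range(N)]
    mean = sum(sample) / N                     # sample mean, not the true mean
    ss = sum((v - mean) ** 2 for v in sample)  # squared differences from it
    sum_div_n += ss / N
    sum_div_n_minus_1 += ss / (N - 1)

avg_div_n = sum_div_n / TRIALS                  # theory: (N-1)/N * 4 = 3.2
avg_div_n_minus_1 = sum_div_n_minus_1 / TRIALS  # theory: 4.0
```

Dividing by N averages out noticeably below the true variance, while dividing by N-1 averages out close to it.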

Maybe it won't hurt to mention also that unbiasedness may be slightly over-rated, at least by non-statisticians. See my paper on this: "An Illuminating Counterexample", American Mathematical Monthly, Vol. 110, No. 3 (March, 2003), pp. 234–238. Michael Hardy (talk) 18:47, 4 September 2010 (UTC)[reply]

Random variables

Hello mathematicians! Can you please help me solve this? It's not homework, it's actually work work. Say $S$ is the amount of money I make per "event" and $E$ is the number of events per year. Let's also say that $S$ has a lognormal distribution and $E$ has a Poisson distribution (the parameters for $S$ can be estimated from some data and let's assume that the parameter for $E$ is known).

A) Then the total money I make from these events in one year is $P=SE$ . Is there an analytic distribution function for $P$ ?

B) Will the following monte-carlo methods work to determine a distribution for $P$ :

1) sample a random value from $E$, say $e$, then sample $e$ values of $S$ and add them up - repeat this many times; or

2) sample a random value from $E$, say $e$, and sample a random value of $S$, say $s$, and then use $es$ - and repeat this many times.

What is the difference between these two methods? What other possible numerical methods can I use to determine $P$ ? Thanks very much. --Mudupie (talk) 17:32, 3 September 2010 (UTC)[reply]

I'll assume that the events don't all make the same amount of money, but rather that each makes an independent contribution drawn from some distribution. Then $P\neq SE$. In fact there isn't even an S; there are iid random variables $S_{1},\ S_{2},\ \cdots ,S_{E}$, and $P=\sum _{i}S_{i}\;\!$. So it's clear that you can't sample the distribution of P with method 2 - you'll get a different distribution which has a much higher variance. You can use method 1, though.

You may know that if X and Y are iid then $\mathbb {V} [X+Y]=\mathbb {V} [X]+\mathbb {V} [Y]=2\mathbb {V} [X]$ while $\mathbb {V} [X+X]=\mathbb {V} [2X]=4\mathbb {V} [X]$. If it seems that E being random makes a difference, think what happens when $\lambda$ is large - then E is roughly constant.

If finding the expectation and variance of the distribution suffices, you have $\mathbb {E} [P]=\mathbb {E} [E]\mathbb {E} [S]$, and if I'm not mistaken $\mathbb {V} [P]=\mathbb {E} [E^{2}]\mathbb {E} [S]^{2}+\mathbb {E} [E]\mathbb {V} [S]-\mathbb {E} [E]^{2}\mathbb {E} [S]^{2}$. This holds no matter what the distributions of E and S are, as long as everything is independent. -- Meni Rosenfeld (talk) 18:56, 4 September 2010 (UTC)
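The two sampling methods can be compared in a short simulation. This is a sketch under assumed parameters: the Poisson rate `LAM` and the lognormal parameters `MU`, `SIGMA` are made-up values, and the Poisson sampler is a simple hand-written one (Knuth's multiplication method) because the standard `random` module has none. The predicted variance is written in the law-of-total-variance form $\mathbb {E} [E]\mathbb {V} [S]+\mathbb {V} [E]\mathbb {E} [S]^{2}$.

```python
import math
import random
import statistics

random.seed(1)
LAM = 5.0             # Poisson rate for E (assumed value)
MU, SIGMA = 0.0, 0.5  # lognormal parameters for S (assumed values)
RUNS = 40000


def poisson(lam):
    """Sample a Poisson variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, prod = 0, random.random()
    while prod > threshold:
        k += 1
        prod *= random.random()
    return k


def method1():
    """Draw e events, then add up e independent amounts."""
    e = poisson(LAM)
    return sum(random.lognormvariate(MU, SIGMA) for _ in range(e))


def method2():
    """Draw e events and a single amount s, then use e*s."""
    return poisson(LAM) * random.lognormvariate(MU, SIGMA)


m1 = [method1() for _ in range(RUNS)]
m2 = [method2() for _ in range(RUNS)]

mean_s = math.exp(MU + SIGMA**2 / 2)
var_s = (math.exp(SIGMA**2) - 1) * math.exp(2 * MU + SIGMA**2)
predicted_mean = LAM * mean_s                   # E[E] * E[S]
predicted_var = LAM * var_s + LAM * mean_s**2   # uses E[E] = V[E] = LAM

mean1, var1 = sum(m1) / RUNS, statistics.variance(m1)
mean2, var2 = sum(m2) / RUNS, statistics.variance(m2)
```

Both methods should give roughly the same mean, but only method 1 should reproduce the predicted variance; method 2 comes out with a markedly larger spread.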

Thanks very much Meni! That was very useful information. I have one follow up question for now. I'm trying to understand how to derive the expectation of P. I guess the following equation holds but I don't understand why: $\mathbb {E} [S_{1}+...+S_{E}]=\mathbb {E} [S_{1}+...+S_{\lambda }]$ . I "get" that it makes sense but I don't know the actual theoretic reason. Can you please explain? --Mudupie (talk) 23:09, 4 September 2010 (UTC)[reply]
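One standard route to the expectation, sketched here under the independence assumptions already stated in the thread, is the law of total expectation: condition on the number of events, then average over it.

```latex
\begin{align*}
\mathbb{E}[P \mid E = e] &= \mathbb{E}[S_1 + \cdots + S_e] = e\,\mathbb{E}[S], \\
\mathbb{E}[P] &= \mathbb{E}\bigl[\mathbb{E}[P \mid E]\bigr]
              = \mathbb{E}\bigl[E\,\mathbb{E}[S]\bigr]
              = \mathbb{E}[E]\,\mathbb{E}[S]
              = \lambda\,\mathbb{E}[S].
\end{align*}
```

The expression with $S_{\lambda }$ only makes literal sense when λ is an integer, whereas the identity above holds for any λ.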

Formula images

In every maths page on Wikipedia I notice the formulae are images not text. How do you create these? On Mac? Thanks for any replies. 86.147.12.111 (talk) 18:05, 3 September 2010 (UTC)

See Help:Displaying a formula. —Bkell (talk) 18:27, 3 September 2010 (UTC)[reply]

Thank you. 86.147.12.111 (talk) 19:42, 3 September 2010 (UTC)

Also, when you see a page with such formulas, if you click on "edit", you'll see how they are created. Michael Hardy (talk) 18:51, 4 September 2010 (UTC)[reply]

Homogeneous polynomials

The symmetric degree 4 homogeneous polynomial in two variables: x⁴ + x³y + x²y² + xy³ + y⁴ can be written (x⁵−y⁵)(x−y)⁻¹ for x≠y. What is the analogous expression for the symmetric degree 4 homogeneous polynomial in 3 variables: x⁴ + x³y + x³z + x²y² + x²yz + x²z² + xy³ + xy²z + xyz² + xz³ + y⁴ + y³z + y²z² + yz³ + z⁴ ? Bo Jacoby (talk) 22:28, 3 September 2010 (UTC).[reply]

First, just to be consistent with the terminology, these are called the complete homogeneous symmetric polynomials. The expression you're looking for follows from the properties of Schur polynomials:

$s_{4}(x,y,z)={\frac {1}{\Delta }}\;\det \left[{\begin{matrix}x^{6}&y^{6}&z^{6}\\x&y&z\\1&1&1\end{matrix}}\right]$

which turns out to be the complete symmetric polynomial. Here Δ is the product of the differences (x−y)(x−z)(y−z).--RDBury (talk) 04:33, 4 September 2010 (UTC)
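The determinant formula can be spot-checked with exact integer arithmetic. The sketch below compares it with the direct sum of monomials at all triples of distinct integers in a small range. It is a numerical check, not a derivation, and the function names are invented for this example.

```python
from fractions import Fraction
from itertools import product


def h4(x, y, z):
    """Sum of all monomials x^i y^j z^k with i + j + k = 4."""
    return sum(x**i * y**j * z**(4 - i - j)
               for i in range(5) for j in range(5 - i))


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def quotient(x, y, z):
    """det / Delta from the formula above; x, y, z must be distinct."""
    delta = (x - y) * (x - z) * (y - z)
    numerator = det3([[x**6, y**6, z**6], [x, y, z], [1, 1, 1]])
    return Fraction(numerator, delta)


points = [(x, y, z) for x, y, z in product(range(-3, 4), repeat=3)
          if len({x, y, z}) == 3]
agree = all(quotient(x, y, z) == h4(x, y, z) for x, y, z in points)
```

At (x, y, z) = (2, 1, 0), for example, the determinant is 62 and Δ is 2, giving 31 = 16 + 8 + 4 + 2 + 1.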

Thank you very much! Bo Jacoby (talk) 06:10, 4 September 2010 (UTC).[reply]

September 4

A better Parametric test for a sine-wave distribution

Lunar cycle theorists postulate cyclic fish activity associated with major and minor lunar periods. This creates a theoretical sine wave centered around the average catch/activity rate. Usually this is illustrated by sine-wave-like imagery with peaks at the top and the negative wave-half flipped, as data for both major and minor periods and the in-between hours can only be positive integer values.

I have a large data set of fish catches, over 10,000 angling hours, with most catches 0 or 1 per hour but ranging up to a very few catches of 9 or 10 per hour. I believe the data to be parametric. Using one-way AOV (Statistix 9) I have a P small enough to declare my overall data collection highly significant by scientific standards. However, breakdowns of the data and subsets by type of water fished and fish size show that several dependent variables affect the results and produce less-than-significant data sets using AOV. However, these subsets still show peaks near the major and minor hours as predicted by the lunar cycle postulates.

Is/are there statistical method/methods that focus on this tendency to have peaks in the right places rather than just seeking sufficient differences between means and SDs? Please provide enough detail so that I might apply your input.

Also please comment if you feel my use of oneway AOV is inappropriate for this type of data. I'm long out of touch with academic sources of statistical guidance.

19:19, 4 September 2010 (UTC)~ —Preceding unsigned comment added by RManns (talk • contribs)


When to use Kendall / Spearman correlations instead of Pearson's?

The statistical software package I use offers, in addition to the Pearson product-moment correlation coefficient, Spearman's rank correlation coefficient and the Kendall tau rank correlation coefficient. I am trying to find an explanation of why one would want to (or when one can) use S/K instead of P. I found a bunch of descriptions, little different from our wiki pages; they go into the math, but I care less about the theory than about application (when to use which).

I am guessing that there are times you want to use P, and times you want to use S/K, and that if you use the incorrect one you'll get a misleading result. (It seems that for P, both variables should be normally distributed (how can I test if this is true?). S/K do not have this assumption (doesn't that make them better by default?).) How do I determine which one to use?

In particular, I am looking at some data that seems not significant under P, but more so under S/K. What does that mean? Is the correlation in the data I am looking at statistically significant or not? --Piotr Konieczny aka Prokonsul Piotrus | talk 21:06, 4 September 2010 (UTC)

Well, in short, the Pearson correlation coefficient tells us whether there is a linear relationship between two variables, and the other two tell us whether there is a monotonic relationship between those variables. Thus a low Pearson correlation coefficient with a high Spearman or Kendall tau correlation coefficient indicates that there might be a monotonic relationship between the two variables, but that it is not linear. For example, it could be that one variable is proportional to the cube of the other. You might wish to look at a scatter graph to find out more. --Martynas Patasius (talk) 22:10, 4 September 2010 (UTC)
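The cube example can be worked through directly. The sketch below computes both coefficients by hand for x = 1, ..., 20 and y = x³ (an arbitrary data set chosen for illustration). The rank helper assumes there are no ties.

```python
import math


def pearson(xs, ys):
    """Pearson product-moment correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def ranks(values):
    """Ranks 1..n of the values (assumes no ties, for simplicity)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman(xs, ys):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


xs = list(range(1, 21))
ys = [x**3 for x in xs]  # monotonic but not linear
r_pearson = pearson(xs, ys)
r_spearman = spearman(xs, ys)
```

Because y increases whenever x does, the ranks coincide and the Spearman coefficient is 1, while the Pearson coefficient stays below 1 because the points do not lie on a straight line.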

What is the test of statistical significance for nominal variables?

Let's say I want to see if the relationship between a nominal variable (country, or certain groupings of countries) and a ratio one (geographical or population size, for example) is statistically significant. What test should I use? Correlation is out, because it is not useful for categorical variables, right? --Piotr Konieczny aka Prokonsul Piotrus | talk 21:39, 4 September 2010 (UTC)

