Content deleted Content added

Inline

Revision as of 11:07, 30 October 2012

Welcome to the mathematics section
of the Wikipedia reference desk.

skip to bottom

Select a section:

Shortcut

WP:RD/MA

Want a faster answer?

Main page: Help searching Wikipedia

How can I get my question answered?

Select the section of the desk that best fits the general topic of your question (see the navigation column to the right).
Post your question to only one section, providing a short header that gives the topic of your question.
Type '~~~~' (that is, four tilde characters) at the end – this signs and dates your contribution so we know who wrote what and when.
Don't post personal contact information – it will be removed. Any answers will be provided here.
Please be as specific as possible, and include all relevant context – the usefulness of answers may depend on the context.
Note:
- We don't answer (and may remove) questions that require medical diagnosis or legal advice.
- We don't answer requests for opinions, predictions or debate.
- We don't do your homework for you, though we'll help you past the stuck point.
- We don't conduct original research or provide a free source of ideas, but we'll help you find information you need.

Ready? Ask a new question!

How do I answer a question?

Main page: Wikipedia:Reference desk/Guidelines

The best answers address the question directly, and back up facts with wikilinks and links to sources. Do not edit others' comments and do not give any medical or legal advice.

October 22

What is the relationship between the order of a conjugacy and the order of elements in that conjugacy class?

A group of order 21 contains a conjugacy class C(x) of order 3. What is the order of x. --150.203.114.14 (talk) 14:30, 22 October 2012 (UTC)[reply]

The size of the conjugacy class of x equals the index [G:C_G(x)] of its centralizer. You can easily determine the order of x from this fact.—Emil J. 16:05, 22 October 2012 (UTC)[reply]

So, 7 then. Thanks. This question may be a bit trickier: the class equation of a group (order 20) is 1+4+5+5+5. Does this group have a subgroup of order 5? If so, is it normal? --130.56.90.212 (talk) 00:06, 23 October 2012 (UTC)[reply]

enumerating a group

I've heard that the Todd–Coxeter algorithm has the useful side-effect of enumerating a group defined by a set of generators. Unfortunately I can't make sense of that description of the algo. Can someone point to a clearer description, or sample code (say in Python or pseudocode)? —Tamfang (talk) 19:08, 22 October 2012 (UTC)[reply]

October 23

Generators of a ring of polys with a particular factor.

What are the generators of the ring of real polynomials divisible by $x^{2}-4x+5$ ? Widener (talk) 03:09, 23 October 2012 (UTC)[reply]

I take it you are allowing your rings to be non-unital. I think the polynomials divisble by

x^{2}-4x+5\,

will be generated by

x^{2}-4x+5\,

and

x(x^{2}-4x+5)\,

, since

x^{n+2}(x^{2}-4x+5)=((x^{2}-4x+5)x^{n}+4x^{n+1}-5x^{n})(x^{2}-4x+5)

Obviously other pairs of generators can be chosen, such as

(x+1)(x^{2}-4x+5)\,

and

(2x+1)(x^{2}-4x+5)\,

. Gandalf61 (talk) 08:39, 24 October 2012 (UTC)[reply]

A simple calculus question

If we have ${\partial ^{2}f}/{\partial x^{2}}$ and replace $x=k.t$ ,k is a constant, what is the denominator of fraction, ${k\partial t^{2}}$ or ${k^{2}\partial t^{2}}$ ? — Preceding unsigned comment added by Re444 (talk • contribs) 12:37, 23 October 2012 (UTC)[reply]

The question doesn't entirely make sense: a partial derivative is not a fraction (and hence has no denominator), and in any event substituting a variable with respect to which a partial derivative is taken cannot be done without simultaneously specifying what the other independent variables (that are being held constant) are replaced by. However, if we look at the standard derivative instead, we have that if

x = kt

with

k

constant throughout the domain of interest, then

{\frac {d^{2}f}{dx^{2}}}={\frac {1}{k^{2}}}\cdot {\frac {d^{2}f}{dt^{2}}}.

— Quondum 13:10, 23 October 2012 (UTC)[reply]

October 24

October 25

October 26

Cylindrical sections

A cylindrical section must be an ellipse or a circle (unless the intersecting plane is vertical). Is there a way to express the eccentricity of this ellipse in terms of the angle of the intersecting plane from the horizontal, and how would it be derived? Double sharp (talk) 10:57, 26 October 2012 (UTC)[reply]

It the radius of the cylinder is

r

and the angle is

\theta

then the length of the Semi-major axis will be

{\frac {r}{\cos \theta }}

. The eccentricity of an ellipse is

{\sqrt {1-{\frac {b^{2}}{a^{2}}}}}={\sqrt {1-{\frac {r^{2}}{\frac {r^{2}}{\cos ^{2}\theta }}}}}=\left|\sin \theta \right|

--Salix (talk): 12:06, 26 October 2012 (UTC)[reply]

Thanks! That's what I thought it would be. Double sharp (talk) 14:23, 26 October 2012 (UTC)[reply]

October 27

Pi or Tau

whats better, pi ( Circumference/Diameter ) or tau (Circumference/radius)?203.112.82.2 (talk) 00:27, 27 October 2012 (UTC)[reply]

Personally, I like pi's positions on the economy, but tau's stance on defense is better. In all seriousness though... "better" in terms of what? You'll have to be significantly less vague. --Kinu ^t/_c 01:49, 27 October 2012 (UTC)[reply]

The discussion is between [1] and [2]. Personally I think that tau is a much better choice. Bo Jacoby (talk) 03:41, 27 October 2012 (UTC).[reply]

Or from a different perspective: had academic inertia been on the other side (i.e. if τ had been the one to become mainstream instead of π), it is difficult to imagine a movement suggesting that π = τ/2 should replace τ occurring. — Quondum 05:46, 27 October 2012 (UTC)[reply]

Both of them are rather good in making puns (provided you pronounce τ using the ancient Greek pronounciation [taw], and not the modern [taf], of course). Double sharp (talk) 07:59, 27 October 2012 (UTC)[reply]

I like pi/3 because it's close to 1. Count Iblis (talk) 16:49, 27 October 2012 (UTC)[reply]

Tau is better.

In response to the pi manifesto:

The area of a sector is ${\tfrac {1}{2}}\theta r^{2}$ , which itself follows from the area of a triangle being ${\tfrac {1}{2}}ah$ . As such the natural way to express the area of a disc is ${\tfrac {1}{2}}\tau r^{2}$ (as it is a sector with angle tau) rather than $\pi r^{2}$ .
The standard normal distribution should of course be then one with a variance of 1. The half in the exponent is just there; and then you may as well group the 2 with the pi.

Of course, I wouldn't actually use tau after years of familiarity with both the number pi and the letter used to represent it. -- Meni Rosenfeld (talk) 20:01, 27 October 2012 (UTC)[reply]

How familiar with the number pi do you claim to be, Meni? Her full panoply of digital secrets has never yet been revealed, although many so-called mathematicians amuse themselves with the idea that computing a zillion quadrillion digits is somehow getting them closer to the Whole Truth, or even at least showing them some interesting scenery along the Road to Nowhere. It doesn't even do that, unfortunately. She knows all there is to know about playing hard to get. -- Jack of Oz ^[Talk] 20:27, 27 October 2012 (UTC)[reply]

The number pi is a very different thing from the decimal digits of pi. The decimal digits (even if you have all

\aleph _{0}

of them as a completed infinite totality) are a fairly inessential surface representation of the underlying Platonic reality. --Trovatore (talk) 22:44, 27 October 2012 (UTC)[reply]

Well I did memorize 90 decimal places 15 years ago (and yes, still remembering them). It's a bond that can never be broken. -- Meni Rosenfeld (talk) 20:40, 27 October 2012 (UTC)[reply]

I for one don't see why using tau would be that much better than using pi. Either would occur together with small rational factors in formulas just as frequently. If you like appeal to authority, let me note how Ken Iverson, who is famous for innovations in mathematical notation, appears to have preferred pi times over tau times. – b_jonas 11:33, 28 October 2012 (UTC)[reply]

I must link to xkcd comic strip 576: Urgent Mission. – b_jonas 11:54, 28 October 2012 (UTC)[reply]

Identifying Rings

I am asked to identify the ring $\mathbb {Z} [x]/(x^{2}-3,2x+4)$ . What does this mean? Does it simply mean "the integers adjoined with ${\sqrt {3}}$ ", for exemple? Or is it isomorphic perhaps to a more well known group (for example, the complex numbers are isomorphic to $\mathbb {R} [x]/(x^{2}+1)$ ) (but I don't think it's isomorphic to the algebraic numbers). --AnalysisAlgebra (talk) 10:40, 27 October 2012 (UTC)[reply]

It doesn't seem like a very well posed question, but I guess they just mean find some simpler looking, or more familiar ring that this one is isomorphic to. It turns out these relations kill quite a lot of stuff and you end up with something pretty small. Rckrone (talk) 20:17, 27 October 2012 (UTC)[reply]

Oh well. Thank you vor your help anyway. — Preceding unsigned comment added by AnalysisAlgebra (talk • contribs) 00:56, 28 October 2012 (UTC)[reply]

I presume the notation

\mathbb {Z} [x]/(x^{2}-3,2x+4)

implies that the relations

x^{2}-3\sim 0

and

2x+4\sim 0

hold simultaneously. Adding these, this would imply that

x^{2}+1\sim 0

. I'm guessing that

\mathbb {Z} [x]/(x^{2}+1)

isomorphic to the Gaussian integers. The relation

2x+4\sim 0

would then restrict these to a very small finite ring, as Rckrone suggests. — Quondum 07:33, 28 October 2012 (UTC)[reply]

Where did you get

x^{2}+1\sim 0

? Are you sure you don't mean

(x+1)^{2}\sim 0

?

R[x]/(P(x))

is isomorphic to the smallest ring containing

R

and elements satisfying

P(x)=0

. So

\mathbb {R} [x]/(x^{2}+1)

is isomorphic to the smallest ring containing

\mathbb {R}

,

i

, and

-i

, which is basically all possible sums and products of those elements, and is

\mathbb {C}

.--AnalysisAlgebra (talk) 09:44, 28 October 2012 (UTC)[reply]

Oops, yes:

(x+1)^{2}\sim 0

. (I understand that you get to ℂ from ℝ, but you're starting from ℤ in this instance.) I'd follow the the equivalences, but I'm on unsure ground here. I seem to end up with a rather small ring. — Quondum 11:10, 28 October 2012 (UTC)[reply]

Functions

Spivak states that "in almost every branch of mathematics, functions turn out to be the central objects of investigation", or something similar. To what extent is this true? What are the most important exceptions? (License for personal opinion is clear, I hope.) —Anonymous Dissident^Talk 14:00, 27 October 2012 (UTC)[reply]

The number of branches of mathematics is not constant. Nor is it well defined which objects of investigation are "central". The statement is subjective and unprecise and the extent of truth cannot be established. The obvious exceptions are of course the branches that predates the concept of "function". Bo Jacoby (talk) 19:18, 27 October 2012 (UTC).[reply]

The statement would ring truer IMO if it read "in almost every branch of mathematics, functions turn out to be central to the investigation of objects". — Quondum 11:21, 28 October 2012 (UTC)[reply]

Yes, this sounds even more true. --pm a 18:55, 28 October 2012 (UTC)[reply]

World series

In the baseball World Series, the winner of 4 of the 7 games wins the series. What are the chances, given that team A has won the first 3, that team B will win the last 4 games, and thus the series ?

Obviously, you can go off the historic record, but it only goes back a century or so, and changes in the rules might make old World Series less useful for statistics. Incidentally, I believe this 3 then 4 pattern happened only once, in 2004.

But I'm more interested in the Bayesian probability approach. Is this a reasonable way to approach the problem ?

A) Assume each game to be an independent event. That is, whatever team A's probability of winning the first game, this is the same probability they will win each of the remaining games. So, no "momentum", considering where the games are played, etc.

B) For the first trial, assume a 1/2 probability of winning. Thus, we get a 1/128 probability for team A winning the first 3 and team B winning the last 4 and the series. (If we allow for the possibility of reversing who wins the 3 games and who wins 4, this doubles the probability to 1/64, but I'm not concerned with that.)

C) However, if we assume that the team which will win 4 games is better, say winning 2/3 of their games, we then get (1/3)³(2/3)⁴ or 16/2187 or 1/136.6875. Looks like 2/3 is going too far.

D) Therefore, should we assume the winning team has a 4/7 chance of winning, since that will be their record in the World Series, if this situation occurs ? This gives a probability of (3/7)³(4/7)⁴ or 6912/823543 or about 1/119.146846.

So, is that the best answer ? StuRat (talk) 22:58, 27 October 2012 (UTC)[reply]

No. To do it in a Bayesian way, you don't just need to assume a prior probability for A winning, you need to assume a probability distribution for the probability of A winning. Once you have that, you can update it based on observing that A has won 3 times.

You can arrive at a suitable distribution by looking at all statistics of the game, not just for A and B.

A popular choice for a prior distribution for the probability of an event is the beta distribution, because it is a conjugate prior for this problem - that is, once you collect data, the updated distribution is again beta. The highest-entropy choice is

\alpha =\beta =1

, which gives a uniform distribution; but that will not likely be adequate for baseball.

Don't forget that you asked for the probability of B winning the last 4 games given that A won the first 3. This means you don't multiply by the probability of A winning the first 3. -- Meni Rosenfeld (talk) 06:29, 28 October 2012 (UTC)[reply]

Oops, I meant the total probability. So, how does this work out ? StuRat (talk) 19:22, 29 October 2012 (UTC)[reply]

Prior to playing 7 games there are 8 equally probable outcomes: team A wins K times and looses 7-K times, where 0≤K≤7. After 3 games team A won k=3 times and lost 3-k=0 times. This can happen in ${\binom {K}{k}}{\binom {7-K}{3-k}}={\binom {K}{3}}$ ways. The requested Bayesian probability is $\Pr(K=3)={\frac {\binom {3}{3}}{\sum _{K=0}^{7}{\binom {K}{3}}}}={\frac {\binom {3}{3}}{\binom {8}{4}}}={\frac {1}{70}}.$ Bo Jacoby (talk) 09:50, 30 October 2012 (UTC).[reply]

[ec]I think the only interpretation of this that makes sense is, "given that A has won 3 times, and that A and B go on to play 7 more games, what is the probability of A winning the first 3 of these 7 games and B winning the last 4?". Then the calculation is as follows:

Start with a prior distribution for p, the probability of A winning, typically Beta with parameters $\alpha ,\ \beta$ . (It is assumed that p itself is fixed, the teams don't get any better or worse throughout the competition).
After observing 3 wins by A, update your distribution to Beta with $\alpha +3,\ \beta$ .
Given p, the probability for A winning the first 3 of the next 7 matches and B winning the last 4 is $p^{3}(1-p)^{4}$ .
So given your distribution for p, which is $P(p)\propto p^{\alpha +2}(1-p)^{\beta -1}$ , the probability of this happening is:

{\frac {\int _{0}^{1}p^{3}(1-p)^{4}p^{\alpha +2}(1-p)^{\beta -1}\ dp}{\int _{0}^{1}p^{\alpha +2}(1-p)^{\beta -1}\ dp}}={\frac {(\alpha +5)!(\beta +3)!(\alpha +\beta +2)!}{(\alpha +2)!(\beta -1)!(\alpha +\beta +9)!}}

For a uniform prior $\alpha =\beta =1$ this gives ${\tfrac {2}{1155}}$ . For a somewhat more realistic $\alpha =\beta =10$ this gives ${\tfrac {143}{24012}}$ . -- Meni Rosenfeld (talk) 09:56, 30 October 2012 (UTC)[reply]

Bo, that's not how probability works; and you basically lost all credibility by saying that all 8 outcomes start out equally probable. (When tossing a coin 100 times, is 50 heads as probable as no heads?) -- Meni Rosenfeld (talk) 09:59, 30 October 2012 (UTC)[reply]

October 28

Zero ring equivalence

Given a ring $R$ and an element $a\in R$ , show that $R[x]/(ax-1)$ is the zero ring if and only if $a$ is nilpotent.--AnalysisAlgebra (talk) 02:06, 28 October 2012 (UTC)[reply]

You're basically introducing a new element $a^{-1}$ aren't you? This means $a^{n}a^{-n}=1$ for all $n$ . However, if $a$ is nilpotent then there exists $n$ such that $a^{n}=0$ . Since anything multiplied by $0$ is $0$ , this means $1=a^{n}a^{-n}=0a^{-n}=0$ . $1=0$ is a sufficient condition for the ring being the zero ring!--AnalysisAlgebra (talk) 04:11, 28 October 2012 (UTC)[reply]

It's a necessary condition too isn't it? Does that prove the "only if" part in the "if and only if"?--AnalysisAlgebra (talk) 04:13, 28 October 2012 (UTC)[reply]

I'm not strong in this area, but I can see a few problems with your approach. The polynomial ring R[x] (and hence the quotient ring R[x]/(ax − 1)) is very different from the ring R. And you are not introducing an element a⁻¹; it either already exists in R or it doesn't. Another point: be clear whether by "zero ring" is meant a zero ring (a ring in which every product is zero) or the trivial ring (the ring with one element). — Quondum 06:58, 28 October 2012 (UTC)[reply]

AnalysisAlgebra means

x

in

R[x]/(ax-1)

will be "a new element

a^{-1}

", and if you want, you can replace all his

a^{-1}

s with

x

s; his proof is correct. But it doesn't show the "only if". For that, consider what it means for the quotient ring to be the trivial ring (which is what is meant). What would have to be in the ideal you are quotienting by? And what would that imply about

a

, after a little work? John Z (talk) 09:40, 28 October 2012 (UTC)[reply]

That would imply

R[x]=(ax-1)

wouldn't it?--AnalysisAlgebra (talk) 10:49, 28 October 2012 (UTC)[reply]

I think for every polynomial

P

,

P(x)(ax-1)\in (ax-1)

. How do you now exploit

a\in R=(ax-1)

which is the only other information you have?--AnalysisAlgebra (talk) 10:52, 28 October 2012 (UTC)[reply]

In particular,

1\in (ax-1)

, so

P(x)(ax-1)=1

for some polynomial

P

. Now analyse the coefficients of

P

.--80.109.106.49 (talk) 11:03, 28 October 2012 (UTC)[reply]

I'VE GOT IT! In that case, you must have

P(x)=-\sum _{k=0}^{\infty }a^{k}x^{k}

and this is a finite polynomial only if

a

is nilpotent! M-M-M-M-MONSTER KILL!!!--AnalysisAlgebra (talk) 11:52, 28 October 2012 (UTC)[reply]

Oh and thanks for all your help by the way.--AnalysisAlgebra (talk) 11:54, 28 October 2012 (UTC)[reply]

Hey, wait a minute - why is

1\in (ax-1)

?--AnalysisAlgebra (talk) 12:10, 28 October 2012 (UTC)[reply]

Because

R[x]=(ax-1)

, as you said above, and

1\in R[x]

. — Preceding unsigned comment added by 80.109.106.49 (talk) 17:23, 28 October 2012 (UTC)[reply]

Bayesian estimation of variance vs statistical unbiasedness

I have recently been thinking about the estimation of an unknown variance from a Normal distribution.

As is well known, maximum likelihood gives an estimate

{\hat {\sigma ^{2}}}_{ML}={\frac {S^{2}}{n}}

where S² = Σ (x_i - μ)² and n is the number of observed data points

This is unbiased in the statistical sense -- i.e. for any true value of σ², the expected value of the estimator is also σ² -- if the mean of the underlying distribution is known

If the mean is not known, it is standard to apply Bessel's correction, to give

{\hat {\sigma ^{2}}}_{pop}={\frac {S^{2}}{n-1}}

to give an unbiased estimator, as derived as an example in the Bias of an estimator article.

If it is an unbiased estimation of standard deviation we're interested in, given unknown mean, there is a slightly different correction that can be made,

{\hat {\sigma }}_{pop}\simeq {\sqrt {{\frac {1}{n-1.5}}\sum _{i=1}^{n}(x_{i}-{\bar {x}})^{2}}}

though in practice pretty much nobody bothers with this, and just uses the square root of the unbiased estimator of population variance instead.

So far, so standard. But then I turned to Bayesian estimation of σ² and σ

A very standard prior to use is the Jeffreys prior for the problem, which places an (improper, but that may not be a problem) rescaling-invariant flat distribution on ln σ, $\scriptstyle {p(\ln \sigma |I)\;\propto \;1}$ , which corresponds to a distribution $\scriptstyle {p(\sigma ^{2}|I)\;\propto \;1/\sigma ^{2}}$ , or $\scriptstyle {p(\sigma |I)\;\propto \;1/\sigma }$ .

If the mean μ is known, this leads to a posterior probability for σ^2 given by an scaled inverse chi-squared distribution, giving a posterior expectation value of

\langle {\sigma ^{2}}_{Bayes}\rangle ={\frac {S^{2}}{n-2}}

If the mean is not known, marginalising over the joint distribution of μ and σ² gives another scaled inverse chi-squared distribution, with a posterior expectation value for σ² of

\langle {\sigma ^{2}}_{pop-Bayes}\rangle ={\frac {S^{2}}{n-3}}

The posterior probability for σ follows an scaled inverse chi distribution, which by comparison with the more general generalized gamma distribution we can find has a posterior expectation value for μ known of

\langle {\sigma }_{Bayes}\rangle ={\frac {S}{\sqrt {2}}}\;{\frac {\Gamma ((n-1)/2)}{\Gamma (n/2)}}

which, for all but the smallest n, converges rapidly towards

\langle {\sigma }_{Bayes}\rangle \simeq {\sqrt {\frac {S^{2}}{n-1.5}}}

the same formula as previously found, but for the unbiased frequentist population estimator -- ie the frequentist estimator with unknown mean.

For μ not known in the Bayesian case, we get

{\begin{aligned}\langle {\sigma }_{pop-Bayes}\rangle =&{\frac {S}{\sqrt {2}}}\;{\frac {\Gamma ((n-2)/2)}{\Gamma ((n-1)/2)}}\\\simeq &{\sqrt {\frac {S^{2}}{n-2.5}}}\end{aligned}}

The Bayesian posterior expectation values for σ² and for σ, both for μ known and μ unknown, are therefore quite different to the Frequentist "unbiased" estimators.

Question: Is it cause for concern that the Bayesian estimators are not unbiased in the Frequentist sense (even given a supposedly uninformative prior)?

Is this just par for the course for a Bayesian inference, which in general cares little about such things? Or does it suggest the prior is not all it should be? (Or have I got my sums wrong?) Jheald (talk) 09:18, 28 October 2012 (UTC)[reply]

I can confirm your results for the case of known mean.

I don't think it's a cause for concern. The Jeffreys prior is just too wide and wild, the variance estimator doesn't even have a mean with 2 or less datapoints. Expecting its mean to be equal the parameter with more datapoints is just too much. (The same could be said for the mean estimator, but for it the prior is at least uniform in linear scale.)

In practice we usually know something (however faint) a priori, so a non-informative prior is not necessarily the best choice. I think the frequentist approach implicitly assumes tame behavior of the parameter. -- Meni Rosenfeld (talk) 13:57, 28 October 2012 (UTC)[reply]

I don't think it's right to says that "the frequentist approach implicitly assumes tame behavior of the parameter" -- the frequentist approach is intended to work whatever the value of the parameter, from epsilon above zero right through to 1/epsilon.

On the other hand you're right that many practising Bayesians, particularly if they perhaps are going to be doing automated Bayes factor model comparison, will try to do all they can to make their priors as realistic as possible, throwing in everything (they think) they know about the problem. And of course, once you've moved to informative priors then the game is very different -- the Bayesian isn't interested in any "unbiasedness" that doesn't take any of that additional information into account; what the Bayesian is interested in is building the best book of odds that they can, so that if an unbiased frequentist does wander into their betting shop, they think they will take them to the cleaners.

But if we exclude that additional information, and stay with a prior that is at least supposed to be uninformative, I still find it very striking that the Bayesian can apparently choose an expectation value for σ^2 that will be higher than the Frequentist's "unbiased" estimator in every case -- and expect to be consistently nearer, in a least squares sense, to the truth. Is this correct? Is the Frequentist approach of trying to make

\scriptstyle {E({\hat {\theta }}|\theta )=\theta }

profoundly misconceived, compared to the Bayesian choosing

\scriptstyle {\hat {\theta }}

such that

\scriptstyle {E(\theta |{\hat {\theta }})={\hat {\theta }}}

? And yet, as your response above indicates, we do tend to assume there is something right about the Frequentist objective, and

\scriptstyle {{\hat {\sigma }}_{N-1}}

; even when a principled Bayesian calculation leads to a higher estimate for every single dataset.

One method or the other, it seems to me, has to face off worse out of this. Jheald (talk) 15:56, 28 October 2012 (UTC)[reply]

There are also some Bayesian books I have had a quick look at, and there does seem to perhaps be a reluctance to "call out" the

\scriptstyle {{\hat {\sigma }}_{N-1}}

estimator, or shine too bright a light on it. (I'll come back and edit in page-refs later; and apologies if I've missed something in quite a cursory flick through).

David MacKay (2003), Information Theory, Inference, and Learning Algorithms finesses the issue (p. 320) by considering the distribution for p(ln σ), which he shows has its modal value at σ = sqrt(S²/N) for the case where μ is known, and at σ = sqrt(S²/N-1) for the distribution obtained after marginalising out μ when μ is not known. This closely reflects the approach he took in his early work on neural nets, which grew out of similar work in the radioastronomy analysis group in Cambridge. But it's far from clear why the value which maximises p(ln σ) is the most appropriate point-estimate for σ² or σ.

E.T. Jaynes (2003), Probability Theory: the Logic of Science has a footnote that in fact S²/(N-3) is the right value, "as we will see in a later chapter". But so far I haven't found the right later chapter, so it's possible that this is another of the parts of the book that never got finalised on paper. Though he does have two really quite polemical chapters against "unbiasedness", describing the cult of it as a pathology.

Box and Tiao (1973), Bayesian Inference in Statistical Analysis consider estimation of the parameters of a normal distribution in detail, but their focus is on finding "highest probability density" (HPD) regions -- which they note will not be invariant under reparametrisation. Their recommendation is to consider the HPD region under a transformation that most nearly makes the prior flat; though I imagine others might be most concerned about loss probabilities. As far as I can see, Box and Tiao do not mention or even allude to

\scriptstyle {{\hat {\sigma }}_{N-1}}

.

Gelman et al (1995, 2e 2004), Bayesian Data Analysis also consider the estimation in some detail, giving the form both of the probabilities for uninformative priors, and of update rules for conjugate informative priors. But again, there seems to be no discussion of or allusion to a frequentist estimator like

\scriptstyle {{\hat {\sigma }}_{N-1}}

.

Finally, Harold Jeffreys (1939, 3e 1961), Theory of Probability also considers Normal distribution parameter estimation, though again what is discussed seems to be more the form of the distribution, rather than any point estimator, or comparison with a popular estimator like

\scriptstyle {{\hat {\sigma }}_{N-1}}

.

All of the above take p(σ) ∝ 1/σ for their uninformative distribution.

So does anybody know of any literature that tackles this particular issue more directly? Jheald (talk) 16:52, 28 October 2012 (UTC)[reply]

Product of ideals is NOT an ideal ?

Given two ideals $I$ and $J$ , why is their direct product $\{ab|a\in I,b\in J\}$ NOT necessarily an ideal? If you have an element in that set and an element $r$ in the ring, don't you have $(ab)r=a(br)$ which is again the product of an element $a$ in $I$ and an element $br\in J$ (since J is an ideal)? Maybe you get problems if the multiplication is not commutative but apparently this is still a problem when it is.--AnalysisAlgebra (talk) 12:33, 28 October 2012 (UTC)[reply]

OH. That's not the problem; the problem is that it is not closed under addition. WHOOPS.--AnalysisAlgebra (talk) 12:38, 28 October 2012 (UTC)[reply]

That's exactly it. I remember making the same mistake when I first encountered the idea (of products of ideals in general, not necessarily `direct' products). You do need to close under addition in the ring -- whenever I see the notation 'IJ', it usually means finite sums of something in I times something in J (apologies for stating that sloppily). Icthyos (talk) 13:28, 28 October 2012 (UTC)[reply]

Riemann series theorem - generalized

RST states that if a series converges conditionally but not absolutely, its terms can be rearranged to give a different value for the sum (and indeed any value, or even to make it diverge). Is it possible to construct a conditionally convergent series such that even merely "rearranging the brackets" will give a different value? This is what I'm talking about:
Consider a series $\sum _{k=1}^{\infty }(-1)^{k+1}b_{n}=b_{1}-b_{2}+b_{3}-b_{4}...$ (assume all b_n are nonnegative also). This series is implicitly bracketed thus:
$\sum _{k=1}^{\infty }(-1)^{k+1}b_{n}=(...(((b_{1}-b_{2})+b_{3})-b_{4})+...$
One possible rearrangement is:
$\sum _{k=1}^{\infty }(-1)^{k+1}b_{n}=(b_{1}-b_{2})+(b_{3}-b_{4})+(b_{5}-b_{6})+...$
What I'm talking about is possible for the Grandi series but Grandi's diverges; I'm looking for a convergent series. Thanks, 24.92.74.238 (talk) 16:15, 28 October 2012 (UTC)[reply]

No, it's not possible. Every partial sum for the bracketed series is also a partial sum for the original series. If the sequence of partial sums for the original series converges, then a subsequence of it must also converge to the same value. Looie496 (talk) 16:28, 28 October 2012 (UTC)[reply]

Computable analog of P/poly

Our article on P/poly says that the class "can be used to model practical algorithms with a separate expensive preprocessing phase and a fast processing phase" and that adversaries in cryptography are sometimes considered to be P/poly machines since "this also admits the possibility that adversaries can do heavy precomputation for inputs up to a certain length". However, since the advice function may be uncomputable, the class of problems in P/poly is far larger than one could ever achieve with Turing machine pre-computation even theoretically. It would seem more reasonable for this purpose to use the problems solvable by a polynomial-time machine with polynomial-sized advice, subject to the restriction that the advice also be computable as a function of the input size. Is there a standard name for this class? « Aaron Rotenberg « Talk « 19:44, 28 October 2012 (UTC)[reply]

Continuum hypothesis

Does anyone know, if;

2^{\aleph _{\alpha }}=\aleph _{\alpha +1}.

What this evaluates to?

2^{{{2}^{2}}^{.}..}.

Edit: It won't let me format that how i wanted to - basically I mean what is 2 to the power of 2 to the power of 2 to the power of 2 to the power of 2 and so on. — Preceding unsigned comment added by 86.151.16.138 (talk) 20:41, 28 October 2012 (UTC)[reply]

The question is what do you mean by that notation? Express it as a limit of finitary expressions! For example, if you mean the limit of the sequence

\aleph _{\alpha },2^{\aleph _{\alpha }},2^{2^{\aleph _{\alpha }}},2^{2^{2^{\aleph _{\alpha }}}},\ldots

then the generalized continuum hypothesis implies that this is

\aleph _{\alpha +\omega }

. JRSpriggs (talk) 06:46, 29 October 2012 (UTC)[reply]

Perhaps a more concrete statement might help. You need to remember that

\alpha

is any ordinal number. So we get the sequence by evaluating the expression for the first few ordinals:

\aleph _{0},\aleph _{1}=2^{\aleph _{0}},\aleph _{2}=2^{2^{\aleph _{0}}},\aleph _{3}=2^{2^{2^{\aleph _{0}}}},\dots

The ordinals are not countable (the subscripts become various infinities), so this list is not countable. — Quondum 08:55, 29 October 2012 (UTC)[reply]

Ah thank you for your help. I did have one other question. Is there a limit to the cardinal numbers or do they just go on forever;

\aleph _{0},\aleph _{1},\aleph _{2},\aleph _{3},\dots

— Preceding unsigned comment added by 109.153.191.44 (talk) 19:13, 29 October 2012 (UTC)[reply]

As noted above, they go on more than forever; there's a distinct aleph number for every ordinal number. For a direct proof of the infinitude of the infinite cardinals, Cantor's theorem is that (even without assuming the continuum hypothesis) the power set of every set has a larger cardinality than the set itself. Applying this repeatedly to the the natural numbers gives an ever-increasing sequence of infinite cardinals. « Aaron Rotenberg « Talk « 00:43, 30 October 2012 (UTC)[reply]

The Vatican needs to be told about this immediately. -- Jack of Oz ^[Talk] 01:00, 30 October 2012 (UTC) [reply]

October 29

Identify this ring

Is the ring $\mathbb {Z} [i]/(2+i)$ isomorphic to a more familiar ring? I have experimented and I see nothing the cosets have in common particularly. I don't think it's isomorphic to the trivial ring since, for instance, $i\in \mathbb {Z} [i]$ but $i\notin (2+i)$ , nor is it isomorphic to $\mathbb {Q}$ which is what you would get if all elements in the same coset lied on a line which passes though the origin on an argand diagram (which is plausible at first) since for instance $5$ is in the same coset as $(2+i)$ $5=(2-i)(2+i)$ . — Preceding unsigned comment added by AnalysisAlgebra (talk • contribs) 00:21, 29 October 2012 (UTC)[reply]

Yes, it's

\mathbb {Z} /5

. It's probably easier to see this if we reverse the order of quotienting (which is valid by the appropriate isomorphism theorem): your ring is

\mathbb {Z} [x]/(2+x)(x^{2}+1)=\mathbb {Z} /((-2)^{2}+1)=\mathbb {Z} /(5).

--80.109.106.49 (talk) 08:11, 29 October 2012 (UTC)[reply]

That looks pretty neat but how do you multiply ideals like that? And how did you change the

\mathbb {Z} [x]

to

\mathbb {Z}

?--AnalysisAlgebra (talk) 10:10, 29 October 2012 (UTC)[reply]

Sorry, that was me being sloppy. I meant

(\mathbb {Z} [x]/(2+x))/(x^{2}+1)

. Then

\mathbb {Z} [x]/(2+x)=\mathbb {Z}

because we're adding a variable

x

, but then setting

x=-2

.--149.148.254.70 (talk) 10:42, 29 October 2012 (UTC)[reply]

Thanks. What about

\mathbb {Z} [x]/(4x-5,2x^{2}-4)

and

\mathbb {Z} [x]/(x^{2}+3,5)

? Using the same technique I get

(\mathbb {Z} [x]/(4x-5))/(2x^{2}-4)=\mathbb {Z} [5/4]/(2(5/4)^{2}-4)=\mathbb {Z} [5/4]/(-7/8)=\mathbb {Q} /(-7/2)

and

\mathbb {Z} [i{\sqrt {3}}]/(5)

respectively. Do those make sense?--AnalysisAlgebra (talk) 12:25, 29 October 2012 (UTC)[reply]

What map is referred to?

Is there a standard map $R\rightarrow R[x]/(ax-1)$ whose kernel is the set of all elements $b$ of $R$ such that $a^{n}b=0$ for some $n>0$ ?--AnalysisAlgebra (talk) 10:08, 29 October 2012 (UTC)[reply]

How many solutions

How many solutions are there to this equation.

${\frac {1}{a}}+{\frac {2}{b}}-{\frac {3}{c}}=1$

Where a , b and c are all positive integers? 220.239.37.244 (talk) 10:11, 29 October 2012 (UTC)[reply]

Infinitely many - consider the cases a = 1 or b = 2. Gandalf61 (talk) 10:23, 29 October 2012 (UTC)[reply]

Agreed. The general rule of thumb is that, with 3 variables, you'd need 3 independent equations to define a unique solution. StuRat (talk) 19:20, 29 October 2012 (UTC)[reply]

The OP specified that the variables ranged over the positive integers. That makes it a Diophantine equation, and in general that significantly complicates the issue of figuring out how many solutions there are (in fact, the question of whether an arbitrary Diophantine equation has a solution is undecidable). --Trovatore (talk) 01:22, 30 October 2012 (UTC)[reply]

Universal cover inducing a short exact sequence

Consider the universal cover, p : E→ X, of a space X (assume all spaces are 'nice'). Does this induce some sort of short exact sequence whose kernel is the deck transformation group (i.e. the fundamental group of X), and which involves (subgroups of) the homeomorphism groups of E and X? I'm inferring from a paper I'm reading that there is such a thing. It refers to it as the 'associated short exact sequence' of the covering, so either I'm being dense, or it's a construction I'm not familiar with. Any clues? Thanks, Icthyos (talk) 12:49, 29 October 2012 (UTC)[reply]

I think I was just being dense. Any homeomorphism of X lifts to one of E, and this lift is unique up to deck transformations, giving a sequence like the one I mentioned. Right? Icthyos (talk) 16:09, 29 October 2012 (UTC)[reply]

October 30

Relationship between $IJ$ and $I\cap J$

Let $I$ and $J$ be ideals of a ring. Is there a relationship between $IJ$ and $I\cap J$ ? Other than the obvious one $IJ\subset I\cap J$ ?--AnalysisAlgebra (talk) 01:09, 30 October 2012 (UTC)[reply]

If the ring is commutative, they have the same radical:

{\sqrt {IJ}}={\sqrt {I\cap J}}

. This also means that if P is a prime ideal, then P contains IJ if and only if it contains I∩J. Rckrone (talk) 05:07, 30 October 2012 (UTC)[reply]

Canonical map

What is the canonical map from a ring $R$ to $R[x]/(ax-1)$ where $a\in R$ ?--AnalysisAlgebra (talk) 05:26, 30 October 2012 (UTC)[reply]

(p.s. and showing that its kernel is $\{b\in R|\exists n\in \mathbb {N} :a^{n}b=0\}$ would be nice too...)--AnalysisAlgebra (talk) 05:29, 30 October 2012 (UTC)[reply]

There is a natural inclusion of R in R[x]. R[x] is the ring of polynomials in x with coefficients in R. The elements of R can also be considered polynomials (with degree 0). Then there is a natural surjection from R[x] to R[x]/(ax-1) mapping a polynomial to its equivalence class mod (ax-1). The composition of these two is the canonical map from R to R[x]/(ax-1). Rckrone (talk) 05:54, 30 October 2012 (UTC)[reply]

Generalization of the Laplacian and partial derivative

Let $u(x,y)$ be the solution of the steady-state heat equation ${\frac {\partial ^{2}u}{\partial y^{2}}}+\sum _{j=1}^{d}{\frac {\partial ^{2}u}{\partial x_{j}^{2}}}=0$ with $u(x,0)=f(x)$
A generalization of repeatedly applying the Laplacian is $(-\Delta )^{a}f(x)=\int _{\mathbb {R} ^{d}}(2\pi |\xi |)^{2a}{\widehat {f}}(\xi )e^{2\pi i\xi \cdot x}d\xi$ and it is quite easy to show that this coincides with the regular Laplacian when a is a positive integer, if f is in the Schwartz space ( ${\widehat {f}}$ is the Fourier transform of $f$ ).
Show that $(-\Delta )^{k/2}f(x)=(-1)^{k}\lim _{y\rightarrow 0}{\frac {\partial ^{k}u}{\partial y^{k}}}(x,y)$ for positive integer $k$ .Widener (talk) 11:07, 30 October 2012 (UTC)[reply]

@@ Line 285: / Line 285: @@
 (p.s. and showing that its kernel is <math>\{b\in R|\exists n\in\mathbb{N} : a^nb=0\}</math> would be nice too...)--[[User:AnalysisAlgebra|AnalysisAlgebra]] ([[User talk:AnalysisAlgebra|talk]]) 05:29, 30 October 2012 (UTC)
 :There is a natural inclusion of R in R[x].  R[x] is the ring of polynomials in x with coefficients in R.  The elements of R can also be considered polynomials (with degree 0).  Then there is a natural surjection from R[x] to R[x]/(ax-1) mapping a polynomial to its equivalence class mod (ax-1).  The composition of these two is the canonical map from R to R[x]/(ax-1). [[User:Rckrone|Rckrone]] ([[User talk:Rckrone|talk]]) 05:54, 30 October 2012 (UTC)
+== Generalization of the Laplacian and partial derivative ==
+Let <math>u(x,y)</math> be the solution of the steady-state heat equation <math>\frac{\partial^2u}{\partial y^2}+\sum_{j=1}^d \frac{\partial^2u}{\partial x_j^2}=0</math> with <math>u(x,0) = f(x)</math><br />
+A generalization of repeatedly applying the Laplacian is <math>(-\Delta)^a f(x) =  \int_{\mathbb{R}^d} (2\pi|\xi|)^{2a}\widehat{f}(\xi)e^{2\pi i\xi \cdot x}d\xi</math> and it is quite easy to show that this coincides with the regular Laplacian when ''a'' is a positive integer, if ''f'' is in the Schwartz space (<math>\widehat{f}</math> is the Fourier transform of <math>f</math>).<br />Show that <math>(-\Delta)^{k/2}f(x) = (-1)^k\lim_{y \rightarrow 0} \frac{\partial^k u}{\partial y^k}(x,y)</math> for positive integer <math>k</math>.[[User:Widener|Widener]] ([[User talk:Widener|talk]]) 11:07, 30 October 2012 (UTC)