Problem 352

A place to air possible concerns or difficulties in understanding ProjectEuler problems. This forum is not meant to publish solutions. This forum is NOT meant to discuss solution methods or giving hints how a problem can be solved.
Forum rules
As your posts will be visible to the general public you
are requested to be thoughtful in not posting anything
that might explicitly give away how to solve a particular problem.

This forum is NOT meant to discuss solution methods for a problem.

In particular don't post any code fragments or results.

Don't start begging others to give partial answers to problems

Don't ask for hints how to solve a problem

Don't start a new topic for a problem if there already exists one


See also the topics:
Don't post any spoilers
Comments, questions and clarifications about PE problems.
ninguem
Posts: 6
Joined: Mon Sep 26, 2011 2:01 am

Problem 352

Post by ninguem » Sun Oct 02, 2011 7:37 pm

Hi,
I have a question about the example given in the problem 352. The problem say that, with optimal strategy, the minimum average number of tests needed for 25 sheep is 4.155452. Also, the problem suggests that rules like "Test the mix of all 25 and that will give you about 60.35% of negative (so only one test is needed) and 39.65% positive (26 tests needed, 25 plus 1 already done)" could be used. Now... The "inefficient example" dividing in 5 groups of 5 gave us an average of 7.40198008 tests to chack 25 sheep. Now using this "inefficient" strategy PLUS the strategy of first test all 25 sheep, gives us 1*0.6035 + 7.40198008*0.3965 which is about 3.5384 (less than 4.155452) tests in average. In fact dividing in some other way (which I wont tell here in respect to the disclaim of the forum) one could get even a little less than that. Would´t that be the optimal strategy for this example?

Thanks and sorry for the english...

[]'s
Image

User avatar
hk
Administrator
Posts: 10482
Joined: Sun Mar 26, 2006 9:34 am
Location: Haren, Netherlands

Re: Problem 352

Post by hk » Sun Oct 02, 2011 7:56 pm

Isn't that all in the following sentences:
Although the scheme we have just described seems to be very efficient, it can still be improved considerably (always assuming that the test is sufficiently sensitive and that there are no adverse effects caused by mixing different samples). E.g.:

•We may start by running a test on a mixture of all the 25 samples. It can be verified that in about 60.35% of the cases this test will be negative, thus no more tests will be needed. Further testing will only be required for the remaining 39.65% of the cases.
•If we know that at least one animal in a group of 5 is infected and the first 4 individual tests come out negative, there is no need to run a test on the fifth animal (we know that it must be infected).
•We can try a different number of groups / different number of animals in each group, adjusting those numbers at each level so that the total expected number of tests will be minimised.
Your task is to find the best strategy.
Image

TripleM
Posts: 382
Joined: Fri Sep 12, 2008 2:31 am

Re: Problem 352

Post by TripleM » Sun Oct 02, 2011 8:19 pm

hk - I think you misunderstood; ninguem was saying the stated optimal value of 4.155452 is not correct.

ninguem - your calculations are incorrect. If the first test of all 25 comes back positive, running the first strategy no longer results in the same average value due to the fact you no longer have the chance all sheep are negative.

User avatar
jaap
Posts: 540
Joined: Tue Mar 25, 2008 3:57 pm
Contact:

Re: Problem 352

Post by jaap » Sun Oct 02, 2011 8:28 pm

TripleM wrote:hk - I think you misunderstood; ninguem was saying the stated optimal value of 4.155452 is not correct.

ninguem - your calculations are incorrect. If the first test of all 25 comes back positive, running the first strategy no longer results in the same average value due to the fact you no longer have the chance all sheep are negative.
Also, the expression 1*0.6035 + 7.40198008*0.3965 as the expected number of tests was wrong for another reason. It should have been either 1*0.6035 + 8.40198008*0.3965 or 1 + 7.40198008*0.3965. The cost of the first test must be borne whether it is positive or not.

ninguem
Posts: 6
Joined: Mon Sep 26, 2011 2:01 am

Re: Problem 352

Post by ninguem » Sun Oct 02, 2011 8:36 pm

Thanks for the reply TripleM,

Yes, I see that there would be a difference in the average AFTER the result come out positive, but I guess that would help EVEN more to reduce the number of tests needed (in the case we need to test all 25), since that is a PLUS information, the average would be even LESS than 7.4...

Thank you!

PS.: I´m not very good at problems (as you can see hehehehe) but I like to try a lot!! And, for me, to understand the problems is as fun as to solve then %-)

[]´s
Image

ninguem
Posts: 6
Joined: Mon Sep 26, 2011 2:01 am

Re: Problem 352

Post by ninguem » Sun Oct 02, 2011 8:42 pm

Thanks also to you jaap,

Sorry, my bad... You are right about 1*0.6035 + 8.40198008*0.3965 being the correct one since the possibilities are 1 or 1+average. But... still that gives us 3.9349 (still less than 4.15). And, as I said to TripleM, it seems to me that, even the average being incorrect AFTER the first test is true, that would reduce even more the overall average ...

Anyway I´m learning a lot here! Thank you all!!

[]'s
Image

User avatar
jaap
Posts: 540
Joined: Tue Mar 25, 2008 3:57 pm
Contact:

Re: Problem 352

Post by jaap » Sun Oct 02, 2011 8:46 pm

ninguem wrote:Yes, I see that there would be a difference in the average AFTER the result come out positive, but I guess that would help EVEN more to reduce the number of tests needed (in the case we need to test all 25), since that is a PLUS information, the average would be even LESS than 7.4...
If the first test (with all 25 samples) came out positive, then that will make the strategy for testing the groups of 5 less efficient. This is because the probability of having all 5 in a batch negative is smaller (you know there are positive ones amongst the 25), so you are more likely to need to test all 5 individually. Therefore the strategy that previously had an average of 7.4 tests in the general case will now (applied only to cases with at least one positive in the 25) have a higher average.

ninguem
Posts: 6
Joined: Mon Sep 26, 2011 2:01 am

Re: Problem 352

Post by ninguem » Sun Oct 02, 2011 8:56 pm

There is is!!

Right jaar, you are right!!! Thank you so much! %-)

Combining that answer with the one by TripleM and hk, I understood the problem!!! Thank you all for the replies!

As I said, I wont try this problem anymore because of those informations I got here (to be fair) my rule is to solve the problems without any help whatsoever (no forums, no books, no consulting, noting...) All the problems I solve up to now (starting from February 2011) I did using formulas derived by myself and the 1 min rule.

Thank you again for the information guys. It is aways awesome to learn from the more experienced !! Thanks!
Image

amcalde
Posts: 2
Joined: Thu Oct 06, 2011 3:41 am

Re: Problem 352

Post by amcalde » Thu Oct 06, 2011 3:47 am

My code makes sense to me and agrees with as far as I can calculate, but does not agree with the 25-sheep value given. My values are WAY too low. ?? Would it be too much to ask for a smaller check? Like say the optimum expected value for 10 sheep or so??

User avatar
mctrafik
Posts: 27
Joined: Thu Oct 06, 2011 5:42 am
Location: Los Angeles, California
Contact:

Re: Problem 352

Post by mctrafik » Thu Oct 06, 2011 5:55 am

I'm having a lot of problems with this one. I get the problem, I'm just not sure how 'optimal' the solution is that we're trying to get.
The example says divide into group once. The optimization says do a test on all sheep first. So does the strategy mean we're always testing the whole sample first, and then only once split them up in groups; OR the mention of levels includes a recursive solution, like take 25 sheep, test all, split into 13 and 12, test the 13, split into 7 and 6, test 7, test 6, split 12 into 6 and 6, test 6 and test the last 6?

My problem is that when there's one division into groups I get answers that are much higher than 4.15 (but obviously lower than 7.4), but when I do a full recursive solution, I get an answer that's much lower than 4.15. I can't figure out which one I'm trying to fix. Some clarification would be nice.
Image
"Nothing in this world that's worth having comes easy"

TripleM
Posts: 382
Joined: Fri Sep 12, 2008 2:31 am

Re: Problem 352

Post by TripleM » Thu Oct 06, 2011 6:11 am

You can perform whatever strategy you like, as long as it satisfies the one restriction mentioned in the problem. The methods described in the problem are solely examples.

User avatar
mctrafik
Posts: 27
Joined: Thu Oct 06, 2011 5:42 am
Location: Los Angeles, California
Contact:

Re: Problem 352

Post by mctrafik » Thu Oct 06, 2011 7:51 am

I still keep getting results that are much better than T(25, .02) = 4.15

Maybe I'm thinking about this wrong.
To me when testing 25 sheep at 2% popularity, it's clearly recursive like this:
25 into 16 and 9
16 into 4 x 4
4 into 2 x 2
9 into 3 x 3
3 into 2 and 1
I can't post results, but what I get is much less than 4.15.

Without the cool optimizations like what to do when the sample size doesn't evenly divide into the group size and assuming last positive if all previous negative in a definitely positive sample,
the math goes like this:

1 + (1 - (.98)^25) * ( // 1 group of 25
1 + (1 - (.98)^16) * ( // 1 group of 16
4 + 4 * (1 - (.98)^4) * ( // 4 groups of 4
2 + 2 * (1 - (.98)^2) * (1.02) // 2 groups of 2
)
) +
1 + (1 - (.98)^9) * ( // 1 group of 9
3 + 3 * (1 - (.98)^3) * ( // 3 groups of 3
1 + (1 - .98^2) * (1.02) + // 1 group of 2
1 // and 1 group of 1
)
)
)

which is ~2.5

Note: 1.02 is one test for the first sheep and the 2% chance that we still need to test the second one.

I don't get it. Further optimizing brings the expected number down even more. Where's the error in my logic?
Image
"Nothing in this world that's worth having comes easy"

TripleM
Posts: 382
Joined: Fri Sep 12, 2008 2:31 am

Re: Problem 352

Post by TripleM » Thu Oct 06, 2011 8:18 am

Your calculations are incorrect for the same reason as discussed earlier in this thread. I'm not really going to say more as this is very bordering on discussing solutions/approaches to the problem.

User avatar
mctrafik
Posts: 27
Joined: Thu Oct 06, 2011 5:42 am
Location: Los Angeles, California
Contact:

Re: Problem 352

Post by mctrafik » Thu Oct 06, 2011 7:57 pm

Ah... Thanks TripleM!

So you're saying before the first test it's just a Binomial(p = .02) and with the new information the distribution gets adjusted because PDF(0) = 0

To all:

I just have no clue how to adjust it. I mean I can plot it in R, but lack the knowledge to express it in a computable way.

Any hint is appreciated.

P.S. I want the posterior so I can compute the equivalent p and do a recursive call.

Edit: The posterior p has a formula. It's not very intuitive, but it is simple.
Last edited by mctrafik on Tue Oct 11, 2011 8:24 pm, edited 1 time in total.
Image
"Nothing in this world that's worth having comes easy"

quilan
Posts: 182
Joined: Fri Aug 03, 2007 10:08 pm

Re: Problem 352

Post by quilan » Fri Oct 07, 2011 12:06 am

Well this is fun. First problem I've sat down to tackle in a while and I've got a great algorithm going that I know is correct (both T(25,0.02) and T(25,0.10) are spot on the nose, and tactically it makes perfect sense). The only problem is that when I do the full problem, I get an invalid answer. When I run the same exact code on a different machine, I get different values (margin of error around +-12)! Turns out massive rounding errors in doubles are a pain in the butt to fix. Time to figure a way to simplify things to make it numerically stable.

Edit: Oops. Looks like my concurrency wasn't correctly locking the critical section. Works now!
ex ~100%'er... until the gf came along.
Image

sivakd
Posts: 217
Joined: Fri Jul 17, 2009 8:37 am
Location: California, USA
Contact:

Re: Problem 352

Post by sivakd » Sat Oct 08, 2011 3:29 am

Finally managed to get the correct values for the examples given in the problem. Now I have to figure out how to scale it for s = 10000.
Image
puzzle is a euphemism for lack of clarity

User avatar
mctrafik
Posts: 27
Joined: Thu Oct 06, 2011 5:42 am
Location: Los Angeles, California
Contact:

Re: Problem 352

Post by mctrafik » Sat Oct 08, 2011 9:01 am

Still can't get the initial values.
I must be doing something wrong.
I keep getting T(25, .02) = 4.12... and T(25, .10) = 12.39...
My thinking goes I only need to take care of two possibilities:
1 - Where we don't know if any are infected *uniform*
2 - Where we know at least one is infected
The first case is straight forward, but for the second case I literally go through all the possibilities: WhatIfOneIsInfected * ChanceOnlyOneInfected ... and sum them up.
But.. not only I highly doubt this method will scale to 10k I get wrong answers. Is there something wrong with my thinking, or should I look for subtle bugs in code?

P.S. This problem is totaling more than 25 hours already... sigh.

Edit: 50 hours into this problem. Turns out Sum WhatIfKInfected * ChanceOnlyEInfected has a closed solution, and I wasn't getting the values I expected because of a bug in the code. Still haven't solved it though.
Last edited by mctrafik on Tue Oct 11, 2011 8:26 pm, edited 1 time in total.
Image
"Nothing in this world that's worth having comes easy"

quilan
Posts: 182
Joined: Fri Aug 03, 2007 10:08 pm

Re: Problem 352

Post by quilan » Sun Oct 09, 2011 5:50 pm

mctrafik wrote:My thinking goes I only need to take care of two possibilities:
1 - Where we don't know if any are infected *uniform*
2 - Where we know at least one is infected
The first case is straight forward, but for the second case I literally go through all the possibilities: WhatIfOneIsInfected * ChanceOnlyOneInfected ... and sum them up.
But.. not only I highly doubt this method will scale to 10k I get wrong answers. Is there something wrong with my thinking, or should I look for subtle bugs in code?
I don't think it's a buster of a clue, but the second case is what threw me off for a long time. Turns out I was calculating it wrong -- there's actually a direct formula for calculating it. Can't give any more hints than that. Keep trying at it & good luck!
ex ~100%'er... until the gf came along.
Image

User avatar
mctrafik
Posts: 27
Joined: Thu Oct 06, 2011 5:42 am
Location: Los Angeles, California
Contact:

Re: Problem 352

Post by mctrafik » Sun Oct 09, 2011 8:06 pm

Thank you :)
quilan wrote: the second case is what threw me off ... there's actually a direct formula for calculating it.
Sorry for the stupid question though, but what do you mean by "it"? The expected value? Posterior probability of any single one being infected? I have a formula, too. It's just got capital sigmas and pis in it. :D Also, isn't a direct formula impossible when every call is recursive?

<-- doubting everything at this point.
Image
"Nothing in this world that's worth having comes easy"

quilan
Posts: 182
Joined: Fri Aug 03, 2007 10:08 pm

Re: Problem 352

Post by quilan » Sun Oct 09, 2011 11:26 pm

mctrafik wrote:Thank you :)
quilan wrote: the second case is what threw me off ... there's actually a direct formula for calculating it.
Sorry for the stupid question though, but what do you mean by "it"? The expected value? Posterior probability of any single one being infected? I have a formula, too. It's just got capital sigmas and pis in it. :D Also, isn't a direct formula impossible when every call is recursive?

<-- doubting everything at this point.
The second case you listed above -- the probability of a subgroup being/not-being infected, knowing that the group itself tested positive. It can be done by a very direct calculation which wasn't the one I originally thought. That threw off all of my values for the longest time even though I had the right idea.
ex ~100%'er... until the gf came along.
Image

Post Reply