Finding shortest subarray with given value: many queries

Arrangements, sorting, packing, partitions, critical path analysis, networks, graphs, ...
Post Reply
wrongrook
Posts: 52
Joined: Sat Oct 17, 2009 9:39 pm

Finding shortest subarray with given value: many queries

Post by wrongrook » Wed Nov 08, 2017 7:40 pm

Background

I came across this interesting stack overflow question:

https://stackoverflow.com/questions/471 ... 3_47111546

Question

There is an array n = 1 million in length containing positive integers.

There are also n = 1 million queries, where each query has a value x and wants to know "what is the shortest length of a subarray with total sum greater than x?"

Has any one any idea on how to solve this with better than O(n^2) complexity?

(Note that I don't know what competition the question is from, so it is possible that there were some additional constraints that make the problem easier)

Thoughts

My first idea is to try and compute the greatest value achievable for each length of subarray. Once we have this it is easy to answer any query by bisection.

My second idea is to try and solve the problem by divide and conquer. i.e. split the array in half and for each half prepare a list of greatest value for each possible length, then merge the lists together.

The problem with this is that we also need to consider cases when the subarray crosses the border. By preparing a cumulative sum array it is easy to measure the content of any particular subarray in O(1), but there are still O(n^2) crossing cases to consider.

At this point I am stuck - anyone have any bright ideas for making progress? (Feel free to post an answer to the stack overflow question or post here)

ironman353
Posts: 11
Joined: Thu Jan 16, 2014 2:29 pm

Re: Finding shortest subarray with given value: many queries

Post by ironman353 » Mon Jan 15, 2018 10:43 am


traxex
Posts: 52
Joined: Thu Oct 19, 2017 12:30 pm

Re: Finding shortest subarray with given value: many queries

Post by traxex » Mon Jan 15, 2018 11:15 am

The StackOverflow question actually already linked to an efficient solution to the variation where just a single query needs to be answered.

But when the array stays constant and we need to answer many queries about it, can we do better than running the same algorithm from scratch each time?

I've thought about this problem a few times since wrongrook posted it here and haven't come up with anything. But I wouldn't be surprised to hear that there is a simple solution; these problems can be very tricky.

One example that highlights some tricky cases is the array [7, 0, 0, 4, 4, 0, 0, 3, 0, 6]. For sum >= 7 the shortest subarray is [7], for sum >= 8 it is [4, 4], and for sum >= 9 it is [3, 0, 6]. All of these subarrays are disjoint, and none of them contains answers for any smaller sums. So any simple idea of building the answer to sum >= x from the answers to sums < x is probably not going to work.
Technically, everyone is full of himself.

Post Reply