mersenneforum.org

mersenneforum.org (https://www.mersenneforum.org/index.php)
-   Puzzles (https://www.mersenneforum.org/forumdisplay.php?f=18)
-   -   What is the fastest algorithm and its computational complexity to the subset-(bit) AND problem? (https://www.mersenneforum.org/showthread.php?t=21527)

Raman 2016-08-26 14:24

What is the fastest algorithm and its computational complexity to the subset-(bit) AND problem?
 
1 Attachment(s)
[b][u]Problem Statement[/u][/b]
I am given some integers like this
828, 753, 239, 873, 126, 923, 303, 317
and I have to find out the size of the smallest subset such that their (bit) AND is 0.
(There may be hundreds of integers and the integers may be hundreds of bits long)
What is the efficient algorithm to solve this problem along with its computational complexity?

This problem is in line with the subset sum problem which looks for a proper subset that sums to 0.
Its decision problem is, by itself, NP-complete, but the decision version of the subset-(bit) AND problem is trivial and can be done in O(n) time by applying (bit) AND over all integers and a proper subset exists if the (bit) AND over all the integers is 0.
I am interested in finding out an algorithm for the size of the smallest subset such that their (bit) AND is 0.

science_man_88 2016-08-26 15:02

[QUOTE=Raman;440732][b][u]Problem Statement[/u][/b]
I am given some integers like this
828, 753, 239, 873, 126, 923, 303, 317
and I have to find out the size of the smallest subset such that their (bit) AND is 0.
(There may be hundreds of integers and the integers may be hundreds of bits long)
What is the efficient algorithm to solve this problem along with its computational complexity?

This problem is in line with the subset sum problem which looks for a proper subset that sums to 0.
Its decision problem is, by itself, NP-complete, but the decision version of the subset-(bit) AND problem is trivial and can be done in O(n) time by applying (bit) AND over all integers and a proper subset exists if the (bit) AND over all the integers is 0.
I am interested in finding out an algorithm for the size of the smallest subset such that their (bit) AND is 0.[/QUOTE]

though I'm not sure I get how it relates to your motivation, my first thought on the problem is have a second vector of the hammingweights ( number of 1's in the binary representation for binary numbers at least) if the hammingweight sums don't hit another in the vector of hammingweights ( you can work off partitions of numbers below roughly 73 in the gp console) then you know they can't be part of a minimal subset of numbers that work. edit: okay this works more of a first step because it tells you 1) if you have two numbers that could simply sum to a third in a way that doesn't overlap, and 2) it tells you the that if none of the elements of the original with the same hamming weight are equal that the minimum hammingweight for a start number to work from is the third lowest hamming weight.

science_man_88 2016-08-26 16:55

[QUOTE=science_man_88;440735]though I'm not sure I get how it relates to your motivation, my first thought on the problem is have a second vector of the hammingweights ( number of 1's in the binary representation for binary numbers at least) if the hammingweight sums don't hit another in the vector of hammingweights ( you can work off partitions of numbers below roughly 73 in the gp console) then you know they can't be part of a minimal subset of numbers that work. edit: okay this works more of a first step because it tells you 1) if you have two numbers that could simply sum to a third in a way that doesn't overlap, and 2) it tells you the that if none of the elements of the original with the same hamming weight are equal that the minimum hammingweight for a start number to work from is the third lowest hamming weight.[/QUOTE]

doh scratch at least most of that I realized you need the opposite of the hammingweight because the number of 0's need to cancel out the number of 1's in the number you start with. but it works similarly.

R. Gerbicz 2016-08-26 19:18

[QUOTE=Raman;440732][b][u]Problem Statement[/u][/b]
I am given some integers like this
828, 753, 239, 873, 126, 923, 303, 317
and I have to find out the size of the smallest subset such that their (bit) AND is 0.
(There may be hundreds of integers and the integers may be hundreds of bits long)
What is the efficient algorithm to solve this problem along with its computational complexity?
[/QUOTE]

Nice problem! You can formulate this with a binary integer program: Say you have n integers in the set: a_1,..,a_n, and the maximal bitlength is L. Then the program that solves the problem:

min sum_{i=1}^L x_i

s.t. sum_{i is in S_k} x_i>=1, for all k=0,..,L-1
where t is in S_k iff the k-th bit of a_t is 0.

You can get the optimal set: choose a_i iff x_i=1. This shows that your problem is (maybe) easily solvable when say n<100.

And an easy (trivial) upper bound: if there is a solution then the smallest subset size is at most L. (it is only "interesting" if L<n). And with a greedy algorithm you can get that (but it can be not optimal...)

CRGreathouse 2016-08-26 19:21

This is the [url=https://en.wikipedia.org/wiki/Set_cover_problem]set cover problem[/url], one of Karp's original 21 NP-complete problems. All known exact algorithms are exponential, and all known approximations are off by a logarithmic factor in the worst case: if you have a thousand numbers, approximate answers might require > 7 times the minimum required number of integers. Improving on either of these would require disproving a well-known conjecture: ETH and P ≠ NP, respectively.

Raman 2016-08-30 05:55

[QUOTE=CRGreathouse;440764]This is the [URL="https://en.wikipedia.org/wiki/Set_cover_problem"]set cover problem[/URL], one of Karp's original 21 NP-complete problems. All known exact algorithms are exponential, and all known approximations are off by a logarithmic factor in the worst case: if you have a thousand numbers, approximate answers might require > 7 times the minimum required number of integers. Improving on either of these would require disproving a well-known conjecture: ETH and P ≠ NP, respectively.[/QUOTE]

Thank you. [COLOR=White]- do - [/COLOR][COLOR=White]- do - [/COLOR][COLOR=White]- do - [/COLOR][COLOR=White]How old are you?[/COLOR]
[COLOR=White]- do -[/COLOR] [COLOR=White]- do - [/COLOR][COLOR=White]- do - [/COLOR][COLOR=White]- do -[/COLOR]
It appears that the optimization version of the set cover problem is NP-hard and the decision version of the set cover problem is NP-complete respectively.
[COLOR=White]- do -[/COLOR] [COLOR=White]- do - [/COLOR][COLOR=White]- do - [/COLOR][COLOR=White]- do -[/COLOR]
If I - do - understand correctly, then even for NP-complete problems, most instances should be solvable with in (deterministic) polynomial running time. I am not sure, if whether for my use case, algorithms with in running time of (deterministic) superpolynomial, subexponential, quasipolynomial or pseudopolynomial do exist.
[COLOR=White]- do -[/COLOR] [COLOR=White]- do - [/COLOR][COLOR=White]- do - [/COLOR][COLOR=White]- do -[/COLOR]
Given that the optimization / decision version of the set cover problem is NP-hard / NP-complete respectively,
for sure,
(Universe = U, Given family of subsets of universe = S, Subsets of S needed to cover universe = C)
1. It is possible to dissolve a set from S which is a proper subset of another set from S. Also, if one (or more?) sets from S union a greater number of sets from S completely, it is also possible to dissolve them completely.
2. As we can see, from within our sets, the total number of times each element appears remains fixed, and is equal to the total number of categories within the following sets made only.
3. If we are looking for a size of the set cover ≤ k, where k is small, then we can go by searching for all subsets of S of size k starting from one sequentially.
4. If elements of U occur only few times with in S, then we can go by searching for all subsets of S from fixing the few sets from S containing the least frequently occurring elements of U, and then scanning through all the remaining sets from S for the remaining elements from U which do occur further more frequently with in S than the other elements from U which do occur less often.
5. May be convert it into graph problem or integer linear programming problem. For the case of a graph, it is a bipartite graph, with one set of vertices (L / left) being all elements from U, and the other set of vertices (R / right) being all given sets of subsets of universe from S. An edge does exist from L to R for all the corresponding sets from R with in S to the appropriate elements from L with in U. This is a special case of the vertex cover problem, that which where by we are looking for the size of the smallest subset of vertices from with in R to be covering all the vertices from with in L.

CRGreathouse 2016-08-30 17:57

[QUOTE=Raman;441048]If I - do - understand correctly, then even for NP-complete problems, most instances should be solvable with in (deterministic) polynomial running time.[/QUOTE]

That's not believed to hold. This is essentially the statement of Impagliazzo's Heuristica, where we are believed to be in Cryptomania. (See [url=http://cseweb.ucsd.edu/~russell/average.ps]A personal view of average-case complexity[/url].)

But it's worse than that: not only do we not believe we can solve general NP-complete problems in polytime, we're working with a problem without a PTAS. Worse still, we're outside APX (assuming P does not equal NP): as problem sizes increase, our ability to approximate answers gets worse and worse, see [url=http://dl.acm.org/citation.cfm?id=258641]Raz & Safra[/url].

Now this doesn't mean that we can't solve or closely approximate instances you care about, because those problem instances may be special. But I certainly wouldn't assume that to be the case!

If you're interested in better approximations it looks like [url=https://arxiv.org/abs/0810.4934]Exponential-time approximation of hard problems[/url] may be of interest.

[QUOTE=Raman;441048]I am not sure, if whether for my use case, algorithms with in running time of (deterministic) superpolynomial, subexponential, quasipolynomial or pseudopolynomial do exist.[/QUOTE]

Pseudopolynomial basically means exponential. :wink:

I'm not aware of any subexponential (exact) algorithms for set cover.

[QUOTE=Raman;441048]5. May be convert it into graph problem or integer linear programming problem. For the case of a graph, it is a bipartite graph, with one set of vertices (L / left) being all elements from U, and the other set of vertices (R / right) being all given sets of subsets of universe from S. An edge does exist from L to R for all the corresponding sets from R with in S to the appropriate elements from L with in U. This is a special case of the vertex cover problem, that which where by we are looking for the size of the smallest subset of vertices from with in R to be covering all the vertices from with in L.[/QUOTE]

Yes, you can "reduce" the problem to solving an exponentially-large instance of vertex covering.

[QUOTE=Raman;441048]1. It is possible to dissolve a set from S which is a proper subset of another set from S. Also, if one (or more?) sets from S union a greater number of sets from S completely, it is also possible to dissolve them completely.
2. As we can see, from within our sets, the total number of times each element appears remains fixed, and is equal to the total number of categories within the following sets made only.[/QUOTE]

Yes. If the instance is easy, it can be solved easily.

[QUOTE=Raman;441048]3. If we are looking for a size of the set cover ≤ k, where k is small, then we can go by searching for all subsets of S of size k starting from one sequentially.[/QUOTE]

Yes. Again, if it's easy then it's easy. But this grows pretty quickly -- even for just a set of size 100, k = 20 is quite infeasible.

Batalov 2016-08-30 19:23

[QUOTE=Raman;441048]Thank you. [COLOR=White]- do - [/COLOR][COLOR=White]- do - [/COLOR][COLOR=White]- do - [/COLOR][COLOR=White]How old are you?[/COLOR]
[COLOR=White]- do -[/COLOR] [COLOR=White]- do - [/COLOR][COLOR=White]- do - [/COLOR][COLOR=White]- do -[/COLOR][/QUOTE]
This BS has got to stop.

jwaltos 2016-08-30 20:57

These are puzzling but may be considered more of a challenge than puzzle.

1. I lifted this from mathoverflow, " It is a well-known open problem in TCS to identify the real complexity class of integer factorization."
[url]http://mathoverflow.net/questions/79366/evidence-for-integer-factorization-is-in-p[/url]

2. A Desperate appeal! (by Richard K. Guy)... deadline is September 30, 2016
[url]http://www.mersenneforum.org/showthread.php?goto=newpost&t=18640[/url]

CRGreathouse, you provided a very nice response within this post!
Because of the quality of the response I would like to solicit an opinion regarding the complexity class to which a general form of integer factorization will belong which I presume includes the integer form of complex values.

In my experience, many different opinions on a single topic is like a bad wind where no one really knows anything. Anyone who has studied this question in depth must agree on certain provable key points beyond which only conjecture and speculation thrive. Are there any papers extant explicitly stating the boundary conditions of this question that separate fact from fiction?

What is sought is probably not ``Wiki-able.``


Regarding, 2., I'm in also.

a1call 2016-08-30 20:59

[QUOTE=Batalov;441103]This BS has got to stop.[/QUOTE]
I missed that. I wonder how you could notice.
Obviously unintentional (though very curious), but the consequences can be severe. I recall Google once banning BMW's main site from its SERPS for hidden text.

[URL]https://support.google.com/webmasters/answer/66353?hl=en[/URL]

I recommend editing out all the hidden text ASAP.

CRGreathouse 2016-08-30 21:22

[QUOTE=jwaltos;441107]CRGreathouse, you provided a very nice response within this post!
Because of the quality of the response I would like to solicit an opinion regarding the complexity class to which a general form of integer factorization will belong which I presume includes the integer form of complex values.[/QUOTE]

Thanks!

Integer factorization is probably the most famous of the NP-intermediate class: in NP ("easy to check") but conjecturally outside P ("easy") and hence not NP-complete. The evidence that it's not NP-complete is compelling: if it were NP-complete that would imply the collapse of NP and co-NP ("easy to disprove") since AKS shows that primality is in P. The evidence that it is not in P is weaker, essentially our inability to find a fast algorithm. Still, the smart money is on NP-intermediate.

I don't think anyone thinks it's in RP ("easy with randomness").

It's also known to be in BQP ("easy with quantum computers") which probably ends up telling us more about the problem than any of the other classes.


All times are UTC. The time now is 05:28.

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2021, Jelsoft Enterprises Ltd.