A serious PHP design flaw – watch out!
Today I discovered a quite disturbing fact about the way PHP handles array indexing. Talk about bad language design…
When using strings that contain only digits (and don’t start with '0' (zero)) as indexes, they are automatically and unavoidably converted to integers. For example:
$arr = array();
$b = '1234';
$arr['xxxx'] = 'y';
$arr[$b] = 'whatever';
foreach ($arr as $key => $value) {
    echo gettype($key)."\n";
}
The expected output would be:
string
string
But it is:
string
integer
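To see which strings are affected, here is a minimal sketch. The sample keys are made up for illustration, and the expected types in the comment follow the key-casting rules described in the PHP manual's array documentation:

```php
<?php
// Hypothetical sample keys, all written as string literals.
$candidates = ['1234', '-5', '0', '0123', '1.5', ' 12', '12abc', '+7'];

$arr = [];
foreach ($candidates as $candidate) {
    $arr[$candidate] = true;
}

// Report the type each key ended up with inside the array.
$keyTypes = [];
foreach ($arr as $key => $unused) {
    $keyTypes[] = var_export($key, true) . ' => ' . gettype($key);
}
echo implode("\n", $keyTypes), "\n";
// '1234', '-5' and '0' come back as integers; the other five stay strings.
```

So a leading zero, a leading plus sign, whitespace, or any non-digit character is enough to keep a key a string, while a plain decimal integer is always cast.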
Oh, wow. This is not a big thing, right? PHP does this all the time, converting numeric strings to integers and vice versa. Right, and it’s very convenient sometimes, but array indexing is something that should be handled very carefully. For example, I discovered these inner workings of PHP while debugging a seemingly mysterious bug in my code. I was filling up an associative array keyed by strings coming from a VARCHAR field in MySQL. Nothing fancy so far. Some of these strings contained only digits. Later on, I used some code like:
foreach ($my_arr as $str_key => $some_value) {
    if (strpos($some_string, $str_key) !== false) {
        . . .
    }
}
And it started producing strange results – strings being matched which evidently didn’t match – wtf is going on? Reading the docs for strpos() I found that if the needle is an integer, it is treated as the ordinal value of a character rather than as a string of digits. But where did those integers come from? ARRGHH, those pesky autoconverted array keys!
To fix it:
foreach ($my_arr as $str_key => $some_value) {
    if (strpos($some_string, (string)$str_key) !== false) {
        . . .
    }
}
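A self-contained sketch of the fix, with made-up sample data standing in for the database values. As far as the strpos() changelog in the PHP manual goes, integer needles were deprecated in PHP 7.3 and are no longer treated as ordinals from PHP 8.0, so the explicit cast matters most on older versions, but it keeps the intent clear everywhere:

```php
<?php
// Hypothetical data: the key '1234' is silently stored as the integer 1234.
$my_arr = ['abc' => 'first', '1234' => 'second'];
$some_string = 'order 1234 shipped';

$matches = [];
foreach ($my_arr as $str_key => $some_value) {
    // Cast explicitly so the needle is always a string.
    if (strpos($some_string, (string) $str_key) !== false) {
        $matches[] = (string) $str_key;
    }
}
var_dump($matches);
```

Only the key 1234 (cast back to '1234') is found in the sample string.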
So if I ever need some facts to use in the “why I hate php” type arguments, this will be in my pocket.
55 comments


This is a well known trait of PHP. Though loosely typed it will always guess at the meaning of an integer or string based on the operation being performed.
This is why certain operations like comparison have so many operators for different levels of the same thing. =, ==, and === all produce different output as each tells PHP which level of logic to use.
In your case having keys interpreted as integers is only a guess by PHP not a transformation. If you are going to use PHP you will have to get used to type casting or write your code so that it is strictly typed.
If I had one thing I could pick to hate about PHP it would be the semi-organic development of the language. There is no road map, only places that the developers would like to visit on the way.
[Reply]
ochronus Reply:
May 20th, 2009 at 06:56
I’m (unfortunately) using PHP on a daily basis, and of course I knew of its guessing. But there’s a distinction there: when I see an arithmetic operation like 3 + '445' it’s evident that '445' should be treated as an integer, also the other way, when I echo a number, PHP can know that it should be converted to a string. But array keying doesn’t fit into this scenario at all. PHP arrays are associative by nature.
[Reply]
Actually, after looking at your code I see you have a couple of things that can be changed to give correct behaviour. First, setting '1234' will be a string. Single quotes (') will set anything within them to a string until an arithmetic operation is called or type guessing is involved. 1234 (without ') is an integer all the time.
/** default keys are int starting at 0 */
$arr = array();
/** is int */
$b = 1234;
/** is string but all numbers, PHP will guess int when used as array key
since keys are int by default */
$c = '1234';
/** xxxx is string and changes $arr to associative array */
$arr['xxxx'] = 'y';
/** first key = 1234 int but remains incremental int */
$a[$b] = 'whatever';
$x[$c] = 'whatever'; // incremental
$y[(string)$c] = 'whatever'; // associative
foreach ($arr as $key => $value) {
    echo gettype($key)."\n";
}
[Reply]
ochronus Reply:
May 20th, 2009 at 08:39
Thank you for taking time for this
$arr = array();
$arr['xxx'] = 'yyy';
$key1 = '1234';
$key2 = 1234;
$arr[$key1] references the exact same item as $arr[$key2], $arr[(string)$key1] and $arr[(string)$key2].
Following your guide I wrote this little test:
$arr = array();
$b = 1234;
$c = '1234';
$arr['xxxx'] = 'literal string item';
$arr[$b] = 'Coming from an integer';
$arr[$c] = 'Coming from a numeric string';
$arr[(string)$c] = 'Coming from a cast numeric string';
$arr[(string)$b] = 'Coming from a cast integer';
foreach ($arr as $key => $value) {
    echo gettype($key).' - '.$value."\n";
}
The output is:
string - literal string item
integer - Coming from a cast integer
So I still can’t force PHP upon array access to make a distinction between '1234' and 1234, or maybe I’m missing something.
[Reply]
Too much magic. *sigh*
[Reply]
Another problem with #PHP typecast :-< http://bit.ly/OdO1o
This comment was originally posted on Twitter
[Reply]
@Atamert Ölçgen
Oh boy, this is some serious black magic, and in the wrong sense.
[Reply]
Here’s how to deal with and never have a problem regarding this stuff:
1) Stop programming like a jack ass
2) Code like a normal human being
3) Make code organized and readable
OR
You can switch to ASP.NET, get out of the PHP business and let professionals do the work.
Either way your code example above is making a mountain out of an ant hill. I am 100% certain if you take a pencil and piece of paper and you map out your problem calmly and correctly you’ll find a much better solution that doesn’t involve hackNslash voodoo.
[Reply]
ochronus Reply:
May 20th, 2009 at 10:24
Thanks for the constructive comment ;)
1. Have you seen any of my PHP code? I doubt it. The code in the article is just a quick thrown-together example.
2. See 1.
3. See 1.
Using associative array keys coming from DB fields is not hackNslash voodoo but common practice.
[Reply]
Sorry for misleading you on that point. No, you cannot force the interpretation. But this should not be necessary because an array should not have both associative keys and incremental keys.
It does not matter, when importing strings as keys, whether they are numbers or not. You choose to make the keys alphanumeric from the start and PHP sees them all as such. The field in MySQL is alphanumeric and so the keys in the array match as alphanumeric.
gettype is also an unreliable way of testing for data types since it does not know the differences between data types. Gettype is only capable of categorizing what a variable contains. To properly test for data type you should use the is_* (is_int, is_string …etc.) functions.
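For what it’s worth, a quick sketch (sample array made up for illustration) suggests that gettype() and the is_* functions report the same thing for array keys, so the conversion shows up either way:

```php
<?php
// Hypothetical array: one plain string key, one numeric-looking string key.
$arr = ['xxxx' => 'y', '1234' => 'whatever'];

$report = [];
foreach ($arr as $key => $value) {
    // Collect gettype() next to the is_* checks for the same key.
    $report[] = [gettype($key), is_int($key), is_string($key)];
}
var_dump($report);
```

The second key is reported as an integer by both gettype() and is_int().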
[Reply]
Reading: A serious PHP design flaw – watch out! >> http://bit.ly/19jeoP
This comment was originally posted on Twitter
[Reply]
@Carl McDade
No, you weren’t misleading, I just wanted to state my point again :) And also really thank you for your time, I appreciate it a lot.
Coming back to my point: I really think that it sucks that $arr['1234'] and $arr[1234] mean the exact same thing to php. Those are two very different keys. One is a string, the other is an integer.
[Reply]
I had some strange behaviour of PHP code, too. I am using function strlen for string length checking. I used it in if statement like this:
if (strlen() > 0) {
// do something
} else {
// do something else
}
Do you see anything wrong with my code? Well, PHP did not complain, but evidently the code didn’t produce the expected result. Before I figured out that I had used strlen() with no parameter, I was thinking of shooting my computer. But the documentation on php.net (http://si2.php.net/manual/en/function.strlen.php) says nothing about a default value for the parameter. This is why I prefer Java over PHP!
[Reply]
ochronus Reply:
May 20th, 2009 at 11:06
Hi,
Actually PHP does notify you with a nice warning:
PHP Warning: Wrong parameter count for strlen() in …. line …
You really should turn warnings and notices on on your dev server. Of course it’s a no-go on a live system, but you should be testing stuff before going live :)
[Reply]
A serious PHP design flaw – watch out! http://bit.ly/OdO1o
This comment was originally posted on Twitter
[Reply]
A serious PHP design flaw – watch out! http://bit.ly/wcnuQ
This comment was originally posted on Twitter
[Reply]
A serious PHP design flaw – watch out! http://bit.ly/wcnuQ (via @joe_carney)
This comment was originally posted on Twitter
[Reply]
A serious PHP design flaw – watch out! http://tinyurl.com/oc2bzl
This comment was originally posted on Twitter
[Reply]
A serious PHP design flaw – watch out! http://tinyurl.com/oc2bzl (via @katharnavas)
This comment was originally posted on Twitter
[Reply]
Essential read! RT @devongovett: A serious PHP design flaw – watch out! http://tinyurl.com/oc2bzl (via @katharnavas)
This comment was originally posted on Twitter
[Reply]
LOL, I get a feeling of deja vu with this. I think the same “bug/question” has been coming up over and over since PHP 4 first became popular. It is something that the core team seems to have put aside as “by design” in all cases. Probably because it leads to arguments for strong data typing.
[Reply]
@Carl McDade
Ok, I can accept that this is a “feature”, not a bug, but I still feel it’s a bad design decision :)
Just a personal opinion, nothing more.
[Reply]
Shouldn’t your array index names _not_ start with a number anyway? It seems like bad coding practice.
[Reply]
ochronus Reply:
May 20th, 2009 at 16:05
What do you mean? An associative array should handle anything that can be represented as a string. I’m using the associative array as something like a cache for items coming from DB so that I don’t have to do a SELECT every time (some items need to be filtered and transformed in a way that’s not possible with joins and mysql functions). What’s bad practice in this?
[Reply]
“1) Stop programming like a jackass.” http://bit.ly/10ZPbY
This comment was originally posted on Twitter
[Reply]
I think you guys are looking at this from the wrong side. The real bug is in the design magic of strpos. I mean, how often is the int mode useful, and is it enough that we can’t be burdened with calling chr() or to throw a warning? I mean, what happens if I pass an object, array, or boolean in as needle, are they converted to an int and then used as the ordinal value?
Sometimes things should just fail. Adding this kind of magic should only happen when it is beneficial for it to happen.
[Reply]
ochronus Reply:
May 20th, 2009 at 19:37
You truly have a point there :) No good comes from the language/platform being too “smart”
[Reply]
haha, sheesh, I remember PHP’s hellish typing. Glad to have moved on to other languages!
I agree, it’s a nasty problem. Solution: always use a prefix like ‘id_’ or ‘structurename_’, so as to prevent disappointment.
I remember one place this prob manifested itself for me was in an XML-RPC library, which could not differentiate my id-keyed hashes from normal arrays, so those keys got lost in the translation.
As an aside, there’s a similar issue in the XML spec regarding ID attributes. Arguably that issue has better justification, but it’s still very similar.
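The prefix idea can be sketched as follows. This is a minimal illustration only: the KEY_PREFIX constant and the sample ids are made up, and any non-numeric prefix should behave the same way:

```php
<?php
// Hypothetical prefix that keeps every key a string.
const KEY_PREFIX = 'id_';

$ids = ['1234', '0042', 'abc']; // sample VARCHAR values
$cache = [];
foreach ($ids as $id) {
    $cache[KEY_PREFIX . $id] = "item $id";
}

$restored = [];
foreach ($cache as $key => $value) {
    // $key is always a string here; strip the prefix to recover the original value.
    $restored[] = substr($key, strlen(KEY_PREFIX));
}
var_dump($restored);
```

The cost is that every read and write has to remember to add or strip the prefix.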
[Reply]
Hi there, I wasn’t finished about the XML thing, but now I’ve had a few wines and I’d like to talk more … please accept my apologies if this doesn’t seem too relevant:
In an XML DTD you can specify an ID attribute for a given tag, and that ID needs to begin with a non-numeric character. Strange? Not really: that ID needs to be unique across a whole document, not just that tagname. Say you have some ‘company’ tags and nested inside those companies you have some ‘employee’ tags. If you’d specified id='1' for company 1, you could no longer use id='1' for an employee. Therefore, you would use id='company_1' for the first company and id='employee_1' for the first employee, whilst still being able to use document.getElementById() to isolate each of these.
So, if prefixes are good enough for XML ids, then why not for PHP hash keys?
Oh well, I’ll go back to my Java and C# and sleep soundly. Go Shakhtar Donetsk
[Reply]
The real PHP gotcha is that you might find yourself using it, and rationalise this behaviour with notions like “it can’t be that bad, everyone’s doing it”. For reference, that’s how World War 2 got started.
[Reply]
In other news: Water is wet.
This comment was originally posted on Reddit
[Reply]
This is why loose typing sucks.
This comment was originally posted on Reddit
[Reply]
I was more disturbed when I found out that calling a function without a required argument only raises an E_WARNING, but still executes the function.
This comment was originally posted on Reddit
[Reply]
That has got to be the most retarded comment ever. Switch to ASP.NET… pfft. If anything you should switch to Java not to some retarded technology for retarded programmers with retarded attitude.
[Reply]
Actually, this is not so bad. You need to know your tools… If you have read any PHP array function documentation at all, you will know that PHP treats integer indices differently. A lot of the array functions will re-index your integer-indexed arrays. This is what you would expect for array concatenation, for example. Because of the loose typing, it is reasonable to cast numerical strings to integers when used for indexing.
This comment was originally posted on Reddit
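The re-indexing mentioned above can be seen with array_merge(), which renumbers integer keys but preserves string keys, whereas the + operator keeps keys as they are. A small sketch with made-up sample data:

```php
<?php
// Hypothetical arrays; '1234' and '5678' are stored as integer keys.
$a = ['1234' => 'numeric-looking key', 'name' => 'string key'];
$b = ['5678' => 'another numeric-looking key'];

// array_merge() renumbers integer keys from 0 and keeps string keys.
$merged = array_merge($a, $b);

// The union operator keeps the original keys.
$united = $a + $b;

var_dump(array_keys($merged)); // 0, 'name', 1
var_dump(array_keys($united)); // 1234, 'name', 5678
```

So a numeric-looking key coming from a database can disappear entirely after a merge, which is another reason to be aware of the conversion.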
[Reply]
I am so glad that I left the PHP world 5 years ago in favour of (both) Ruby and Python.
[Reply]
tl;dr, the serious design flaw of PHP is PHP.
This comment was originally posted on Reddit
[Reply]
Thanks for pointing out, I didn’t know PHP had any design flaws, let alone serious ones!
This comment was originally posted on Reddit
[Reply]
PHP is the design flaw. Perl for life!
This comment was originally posted on Reddit
[Reply]
PHP has a design?
This comment was originally posted on Reddit
[Reply]
>PHP: A Serious Design Flaw fix’d
This comment was originally posted on Reddit
[Reply]
> Because of the loose typing, it is reasonable to cast numerical strings to integers when used for indexing.
No it isn’t.
This comment was originally posted on Reddit
[Reply]
Your argumentation technique is lacking something.
This comment was originally posted on Reddit
[Reply]
It’s not PHP’s fault if you’re incompetent.
[Reply]
What would be my incompetence here?
[Reply]
consider the alternative. Would you like to have arrays where $foo[123] and $foo['123'] have distinct values?
This comment was originally posted on Reddit
[Reply]
I, for one, would like that. 123 is an integer, which is a whole different animal than '123', which is a string that just happens to be the textual representation of the same number. Or at least let me force PHP to use it as a string, like $foo[(string)'123']
This comment was originally posted on Reddit
[Reply]
[software] A serious #PHP design flaw – watch out! http://bit.ly/RIMuO
This comment was originally posted on Twitter
[Reply]
I find this piece of code even more amusing. And don’t tell me you expect this to happen:
$a = false; // or ''
var_dump($a);
$b = &$a['foo'];
var_dump($a);
false
Array {
'foo' => null
}
[Reply]
@Rob What an astonishingly rude and ill-considered comment. No “professional” programmer would consider being so arrogant. PHP has serious typing problems and is seriously limited as an OO language. Some of us are professional programmers who have to work in PHP and work around its inadequacies every day. Comments like yours show a serious lack of understanding.
[Reply]