herenvardo

I've noticed some strange behaviour when using C#'s 'is' operator.

As stated in http://msdn.microsoft.com/library/default.asp?url=/library/en-us/csref/html/vclrfispg.asp:

An is expression evaluates to true if both of the following conditions are met:

  • expression is not null.
  • expression can be cast to type. That is, a cast expression of the form (type)(expression) will complete without throwing an exception. For more information, see 7.6.6 Cast expressions.

The following expression, however, returns 'false':

0 is byte

Since 0 is not null, and (byte)0 casts successfully without any kind of exception, shouldn't it return true? Just for your consideration.

By the way, the expression '(byte)0 is byte' returns true. In other words: when you check whether you can cast it, the operator answers 'no', but if you actually cast it and check again, the operator answers 'yes'. That's completely awful and counter-productive: the main use of this operator is to check whether something can be cast; having to cast in order to check makes the operator almost useless!



Re: Visual C# Express Edition Weird behaviour of the 'is' operator. Is this a bug?

Geert Verhoeven

Hi,

Strange behavior indeed. I guess that in the statement 0 is byte, 0 is interpreted as an int, which cannot be cast to a byte without potentially losing data.

If you try the code below, you will see that the screen will display Int32 as a result:

object x = 0;

if (x is byte)
{
   Console.WriteLine("byte");
}
else
{
   Console.WriteLine(x.GetType().Name);
}
Console.ReadLine();

EXTRA INFO (C# Language Specification):

1.1.1 The is operator

The is operator is used to dynamically check if the run-time type of an object is compatible with a given type. The result of the operation e is T, where e is an expression and T is a type, is a boolean value indicating whether e can successfully be converted to type T by a reference conversion, a boxing conversion, or an unboxing conversion. The operation is evaluated as follows:

  • If the compile-time type of e is the same as T, or if an implicit reference conversion (§6.1.4) or boxing conversion (§6.1.5) exists from the compile-time type of e to T:
      o If e is of a reference type, the result of the operation is equivalent to evaluating e != null.
      o If e is of a value type, the result of the operation is true.
  • Otherwise, if an explicit reference conversion (§6.2.3) or unboxing conversion (§6.2.4) exists from the compile-time type of e to T, a dynamic type check is performed:
      o If the value of e is null, the result is false.
      o Otherwise, let R be the run-time type of the instance referenced by e. If R and T are the same type, if R is a reference type and an implicit reference conversion from R to T exists, or if R is a value type and T is an interface type that is implemented by R, the result is true.
      o Otherwise, the result is false.
  • Otherwise, no reference or boxing conversion of e to type T is possible, and the result of the operation is false.

Note that the is operator only considers reference conversions, boxing conversions, and unboxing conversions. Other conversions, such as user defined conversions, are not considered by the is operator.

Greetz,

Geert

 

Geert Verhoeven
Consultant @ Ausy Belgium


Re: Visual C# Express Edition Weird behaviour of the 'is' operator. Is this a bug?

nobugz

Explicit conversion operators, such as the one used to cast an int to a byte, are the rub here. Here's a contrived example:

public struct A {
   public int field;
}
public struct B {
   public static explicit operator B(A value) {
      B temp;
      temp.field = Convert.ToByte(value.field);
      return temp;
   }
   public byte field;
}

This code is now valid, invoking the explicit conversion operator:

A a;
a.field = 256;
B b = (B)a;

It will, however, throw an OverflowException. The "is" operator implicitly promises that the cast will *not* throw an exception, so the implementation of the "is" operator cannot consider explicit conversion operators.





Re: Visual C# Express Edition Weird behaviour of the 'is' operator. Is this a bug?

herenvardo

First of all, I want to point out that I was just commenting on something I find, at the very least, curious, and not actually asking anything (I made sure not to mark my post as a question)... I've been programming in many languages for many years, and I'm very used to the idea from plain old C compilers that a numeric literal is taken to be of the smallest type able to represent it... it seems that C# treats any integral literal as a 32-bit integer by default, even if 8 bits are more than enough to hold it. Although I don't like such a waste, I guess it's good to know... so I'll put (byte) before my literal 0s so I save 3 bytes in each case :P

What surprised me about this behaviour is that I would never have expected 0 to be taken as an int when it could perfectly well be handled as either a byte or an sbyte. Thus, it surprised me that, taken literally, 0 is not a byte. Actually, if you represent it as 00000000 it's a byte, while if you take it as 00000000000000000000000000000000 of course it's not! Which leads to another interesting thought: it seems that, in C#, leading 0s do matter XD.

Many people on the web say that C# was designed 'to make life easier for compilers'. Although I don't share this viewpoint, since I know that C# was designed to integrate as well as possible with the whole .NET thing, I must say that in cases like this it looks as if it had been designed that way.

To end this bunch of useless verbosity I'm writing, I'll mention that I'm actually very new to C#. I'm doing my first project in the language, and it is itself the most complex and ambitious project I've ever started... just because I love challenges and experimenting. Surprises like this one are a bit of fun amid the hard work I'm doing ^^ .





Re: Visual C# Express Edition Weird behaviour of the 'is' operator. Is this a bug?

nobugz

C# doesn't behave any different than the 'C' language. A literal is represented in the "native" type of the compiler. In C#, a numeric literal is "int", unless it contains a decimal point, which makes it "double", or it overflows the value range of an int, which automatically promotes it to a long. Quotes make it a string. You've got suffixes to force the compiler to recognize the literal as a different type: "F" for float, "M" for decimal, etc. The amount of storage needed for a literal is inconsequential until you assign it to a variable. Something like this just doesn't save any bytes at all:
int value = (byte)0;

An int takes 4 bytes, no matter what the literal value is.
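For the C side of that comparison, here is a minimal sketch (assuming a C11 compiler for _Generic; the function names are made up for illustration). It suggests that an unsuffixed literal 0 already has type int before it is assigned anywhere, and that only an explicit cast makes it byte-sized:

```c
#include <stddef.h>

/* Returns 1 when the constant 0 has type int, 0 otherwise (C11 _Generic). */
int zero_literal_is_int(void) {
    return _Generic(0, int: 1, default: 0);
}

/* Size in bytes of the bare literal 0. */
size_t size_of_zero_literal(void) {
    return sizeof 0;
}

/* Size in bytes of the literal once it is cast down to unsigned char.
   The extra parentheses are needed: sizeof (unsigned char)0 would not parse. */
size_t size_of_byte_cast(void) {
    return sizeof((unsigned char)0);
}
```

Calling size_of_zero_literal() should give the same number as sizeof(int), whatever that happens to be on the compiler at hand.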

Observe the compiler error message when you do this:
byte value = 256;






Re: Visual C# Express Edition Weird behaviour of the 'is' operator. Is this a bug?

herenvardo

If you don't mind, I'll quote some parts of your post to reply to them separately. Please note that neither my previous post nor this one is actually very serious, but I think a bit of joking from time to time is very healthy.

"C# doesn't behave any different than the 'C' language." Really? I thought they were different languages! Ok, I'm taking it out of context... but I'll get back to this at the end, in a more 'in-context' reply.

"In C#, a numeric literal is "int", unless it contains a decimal point which makes it "double"" (Should I have escaped the quotes within this quoted quote? Have you noticed how much room English gives for word games?) In C, an integral literal has no type at all... it goes more along the lines of what you said later: "The amount of storage needed for a literal is inconsequential until you assign it to a variable." In C, it is the context in which you use the literal that somehow 'assigns' a type to it. And this is done very differently in C than in C#... you'll see a clear example at the end of this post.

"Something like this just doesn't save any bytes at all:
int value = (byte)0" Of course it doesn't... but if you do 'object value = (byte)0' you should be using, at least in theory, less space than if you do 'object value = (int)0' (both would spend space on the reference, and then either one or four more bytes on the actual value). And it seems that in C# 'object value = 0' behaves like 'object value = (int)0'... what a pity that C doesn't have an equivalent of C#'s 'object' type to check this against... to represent 'any value' C usually uses the void* type, but that forces us to use pointers, and the ANSI C standard fortunately doesn't mess with pointers to literals (which are actually an absurd thing)... however, some compilers (such as the oldest Turbo C versions, which appeared before the ANSI standardization) allowed something like:
void* p = (void*)&0;
Although normally it wouldn't be allowed to take the address of a literal, these 'granny' compilers allowed this abuse of notation for initializing pointers, allocating the literal value and returning the allocated address (this had some memory leak hazards to consider, but that would be off topic). Do you dare to guess which type was used to allocate the literal? As stated in the docs, "the smallest type that both matches the context and can store the value", which in this case could be either "unsigned char" or "signed char", and it was actually the same as the compiler's default "char" (it would be forced to signed if the literal were negative). This is probably the nearest equivalent to 'object' the way I used it above.

"An int takes 4 bytes, no matter what the literal value is." Actually, I love that feature of C#. I can be sure that an int takes 4 bytes. Don't even dare to say such a thing when speaking about C, because not even the ANSI C standard fixes the size of an int. You can only assume that a short will never be larger than an int, and the latter will never be larger than a long. If a compiler used 16 bits for both short and int, it would still be standard-compliant! And a compiler could even decide that a short is a byte, an int is a kilobyte and a long a megabyte... this is of course very unlikely to happen, but it would still be compliant with the ANSI and ISO standards for C.

So, do you still believe that "C# doesn't behave any different than the 'C' language."? If so, let me quote your last words to make a last comparison:
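To see what a particular C compiler actually chose, something along these lines can help. It is only a sketch: the helper names are invented for illustration, and the minimum ranges checked (at least 16 bits for short and int, at least 32 bits for long) are, as far as I know, the ones the C standard requires through limits.h.

```c
#include <limits.h>

/* 1 if short, int and long have non-decreasing sizes on this compiler. */
int sizes_are_ordered(void) {
    return sizeof(short) <= sizeof(int) && sizeof(int) <= sizeof(long);
}

/* Storage width of int in bits on this compiler (counts any padding bits too). */
int int_width_bits(void) {
    return (int)(sizeof(int) * CHAR_BIT);
}

/* 1 if the minimum ranges mentioned above are met. */
int meets_minimum_ranges(void) {
    return SHRT_MAX >= 32767 && INT_MAX >= 32767 && LONG_MAX >= 2147483647L;
}
```

On an old 16-bit compiler int_width_bits() would be expected to report 16, and on most current desktop compilers 32; the point is that the language leaves this to the compiler.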

"Observe the compiler error message when you do this:
byte value = 256;"
Ok. I observed it, in both the C# and C compilers. The former fails with the error "256 cannot be converted to byte" (I'm translating it from my Spanish version of VC# Express), while the latter just says, as I expected, "undefined symbol 'byte'". But that message has nothing to do with the actual value, so I changed the line to:
unsigned char value = 256;
And I made sure that the compiler was old enough to use 8 bits for chars. Know what? The compiler gave no error! Just a warning saying "conversion may lose significant data". To find out exactly what was being lost, I added a printf("%d", value); after it. It gave me an output of 0. And to make it more illustrative, I tried a few other values: for 257 the output was 1, and for 356 the output was 100. In all cases it tossed me the warning, but there was no error. The compiler just takes the lowest bits of the value, as many as fit in the variable, and tosses the warning... ah, the old C... not even C++ was so simple in these cases...
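The same experiment can be wrapped in a small function. This is a sketch that assumes 8-bit chars (CHAR_BIT == 8); narrow_to_uchar is just a name picked for the example:

```c
#include <limits.h>

/* Convert an int to unsigned char. Conversion to an unsigned type is defined
   as reduction modulo 2^CHAR_BIT, so with 8-bit chars only the low 8 bits
   of the value survive. */
unsigned char narrow_to_uchar(int value) {
    return (unsigned char)value;
}
```

With 8-bit chars, narrow_to_uchar(256) gives 0, narrow_to_uchar(257) gives 1 and narrow_to_uchar(356) gives 100, matching the printf results above. The explicit cast also keeps most compilers from emitting the warning.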

By the way, an anecdote: this was a typical joke played on many C beginners in the old (386 & 486) days. As their first exercise, they were asked to write a program that printed the numbers from 1 to 100 000 to the standard output... easy, isn't it? Most of them tried code like this:
#include <stdio.h>
int main() {
    for(int i=1; i<=100000; i++) printf("%d", i); // some people used the while approach instead of for, but got the same result.
}
It's so simple, it works, of course! Does it? With modern C compilers it would probably work, but in those days... oh, you should have seen their faces when they looked at the screen and saw the never-ending loop that, after printing 32767, printed -32768 and continued counting, only to fall back to -32768 whenever it hit 32767... I must say that this exercise was very educational XD... after this, many newbies used only longs for a long while XD You can get the same loop in C# by using 'short', but even so the compiler will toss you a warning giving a good clue about why it fails.
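On a modern compiler with 32-bit ints the loop simply finishes, so to replay the old behaviour you have to ask for 16 bits explicitly. The following is a sketch with made-up helper names; it does the wrap-around in unsigned arithmetic so that it doesn't depend on how a given compiler narrows out-of-range signed values:

```c
#include <stdint.h>

/* Next value of a 16-bit two's-complement counter, wrapping from 32767
   to -32768 the way the old 16-bit int did in practice. */
int16_t next_int16(int16_t i) {
    uint16_t u = (uint16_t)((uint16_t)i + 1u);   /* wraps modulo 65536 */
    return (u >= 0x8000u) ? (int16_t)((int32_t)u - 65536) : (int16_t)u;
}

/* Number of increments, starting from 1, before the counter turns negative. */
long steps_until_negative(void) {
    int16_t i = 1;
    long steps = 0;
    while (i > 0) {
        i = next_int16(i);
        steps++;
    }
    return steps;
}
```

Here next_int16(32767) yields -32768, which is exactly the jump the beginners saw on screen; a 16-bit counter can never reach 100 000, so the original loop condition never becomes false.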

As a final thought about C# and C, I want to state this:
- C is a low-level, structured, weakly typed language, mainly aimed at systems programming, where 'escaping' the type system may be needed in some cases.
- C# is a high-level, object-oriented and component-oriented, strongly typed language, aimed at making the most of the .NET Framework to meet the modern needs of GUI, desktop and web applications (yes, it supports console applications, but they are not the main aim of the language).
So please, don't say "C# doesn't behave any different than the 'C' language.".