Error Handling in OneFlow: Maybe
Written by | 李新奇, twice, 姚迟
1
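The Dilemma of Error Handling in C++
C++ code typically handles errors in one of two ways: exceptions, or error codes returned from functions.
Exceptions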
    return add_rainbow(
             make_smaller(
               make_eyes_sparkle(
                 add_bow_tie(
                   crop_to_cat(img)))));
}
    try {
        return add_rainbow(
                 make_smaller(
                   make_eyes_sparkle(
                     add_bow_tie(
                       crop_to_cat(img)))));
    } catch (...) {
        return nullptr;
    }
}
Returning error codes from functions
    if (y == 0) {
        return error_code;  // placeholder for whatever error code the function uses
    }
    return x / y;
}
    auto cropped = crop_to_cat(img);
    if (!cropped) {
        return nullptr;
    }
    auto with_tie = add_bow_tie(*cropped);
    if (!with_tie) {
        return nullptr;
    }
    auto with_sparkles = make_eyes_sparkle(*with_tie);
    if (!with_sparkles) {
        return nullptr;
    }
    return add_rainbow(make_smaller(*with_sparkles));
}
                                 const char* msg) {
  string r("Non-OK-status: ");
  r += msg;
  r += " status: ";
  r += v.ToString();
  // Leaks string but this is only to be used in a fatal error message
  return new string(r);
}
inline tensorflow::string* TfCheckOpHelper(::tensorflow::Status v,
                                           const char* msg) {
  if (v.ok()) return nullptr;
  return TfCheckOpHelperOutOfLine(v, msg);
}
#define TF_DO_CHECK_OK(val, level) \
  while (auto _result = ::tensorflow::TfCheckOpHelper(val, #val)) \
  LOG(level) << *(_result)
#define TF_CHECK_OK(val) TF_DO_CHECK_OK(val, FATAL)
#define TF_QCHECK_OK(val) TF_DO_CHECK_OK(val, QFATAL)
output_alloc_attr(index));
TF_CHECK_OK(s);
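A method that can be checked with TF_CHECK_OK can only return the Status class.
A method call wrapped in TF_CHECK_OK can only be used as a statement, as in TF_CHECK_OK(Foo(…));. It cannot take part in further computation as an expression (for example, const auto data = TF_CHECK_OK(Foo(…)) is an error). This is really a consequence of the previous limitation.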
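Both limitations can be seen in a small self-contained sketch. The `Status` class, the `CHECK_OK` macro, and the `SafeDiv` and `DivOrDie` functions below are hypothetical stand-ins written for illustration, not TensorFlow's actual definitions. Because the return slot is occupied by the status, the computed value has to travel through an out-parameter, and the check itself yields no value.

```cpp
#include <cstdlib>
#include <iostream>
#include <string>
#include <utility>

// Hypothetical stand-in for a Status class: it reports success or an error
// message, but never carries a computed value.
class Status {
 public:
  Status() = default;  // a default-constructed Status means success
  explicit Status(std::string msg) : ok_(false), msg_(std::move(msg)) {}
  bool ok() const { return ok_; }
  const std::string& message() const { return msg_; }

 private:
  bool ok_ = true;
  std::string msg_;
};

// Hypothetical statement-style check: aborts on failure, produces no value.
#define CHECK_OK(expr)                                             \
  do {                                                             \
    const Status status_ = (expr);                                 \
    if (!status_.ok()) {                                           \
      std::cerr << "Non-OK-status: " #expr " status: "             \
                << status_.message() << std::endl;                 \
      std::abort();                                                \
    }                                                              \
  } while (0)

// Limitation 1: the function can only return Status, so the quotient is
// written through an out-parameter.
Status SafeDiv(int x, int y, int* out) {
  if (y == 0) { return Status("y cannot be zero"); }
  *out = x / y;
  return Status();
}

int DivOrDie(int x, int y) {
  int result = 0;
  CHECK_OK(SafeDiv(x, y, &result));  // fine: used as a statement
  // Limitation 2: the line below would not compile, because the macro
  // expands to a statement rather than an expression.
  // const int r = CHECK_OK(SafeDiv(x, y, &result));
  return result;
}
```

Calling `DivOrDie(10, 2)` returns the quotient, while a zero divisor terminates the process from inside the macro.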
2
The Elegant Approach in Haskell: Just Return
  if y /= 0            -- if y is not equal to 0
  then Just(div x y)   -- return the result of x/y directly
  else Nothing         -- return Nothing
safediv :: Integral a => a -> a -> Maybe a
Just 5
Nothing
5
*** Exception: Maybe.fromJust: Nothing
CallStack (from HasCallStack):
error, called at libraries\base\Data\Maybe.hs:148:21 in base:Data.Maybe
...
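For readers more at home in C++, roughly the same shape can be sketched with C++17's std::optional, which the reference list also mentions. This is only an analogy for illustration, not how OneFlow implements Maybe; the names `SafeDiv` and `UnsafeUnwrap` are hypothetical. `std::nullopt` plays the role of Nothing, and calling `value()` on an empty optional fails in much the same way that fromJust fails on Nothing.

```cpp
#include <optional>

// Counterpart of safediv: std::nullopt plays the role of Nothing,
// and an engaged optional plays the role of Just.
std::optional<int> SafeDiv(int x, int y) {
  if (y == 0) { return std::nullopt; }
  return x / y;
}

// Like fromJust, value() assumes a result is present; on an empty optional
// it throws std::bad_optional_access instead of returning.
int UnsafeUnwrap(const std::optional<int>& m) { return m.value(); }
```

The caller still has to remember to test `has_value()` before unwrapping, which is the kind of boilerplate a checking macro is meant to remove.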
3
Maybe in OneFlow
Using OneFlow Maybe
  CHECK_NE_OR_RETURN(y, 0) << "y cannot be zero";
  return x / y;
}
https://github.com/Oneflow-Inc/oneflow/blob/master/oneflow/core/common/maybe.h
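How it works
Maybe<void>: corresponds to the original void return type and is equivalent to TensorFlow's Status class.
Maybe<ClassType>: a user-defined data type (class/struct). When the data is taken out of the Maybe, its type is shared_ptr<ClassType>.
Maybe<ScalarType>: a C++ scalar data type. When the data is taken out of the Maybe, its type is ScalarType itself.
Maybe<ReferenceType>: a C++ reference type. When the data is taken out of the Maybe, its type is the reference itself.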
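The four cases above amount to a compile-time mapping from T to the type the caller receives. The sketch below expresses that mapping as a type trait. The name `Unwrapped` is hypothetical and this is not taken from OneFlow's maybe.h; it only restates the list in code.

```cpp
#include <memory>
#include <string>
#include <type_traits>

// Hypothetical trait (not OneFlow's code): maps the T in Maybe<T> to the
// type handed back to the caller when the value is taken out.
//   void, scalar, reference -> T itself
//   anything else (classes)  -> std::shared_ptr<T>
template <typename T>
using Unwrapped = std::conditional_t<
    std::is_void<T>::value || std::is_scalar<T>::value ||
        std::is_reference<T>::value,
    T, std::shared_ptr<T>>;

static_assert(std::is_void<Unwrapped<void>>::value,
              "Maybe<void> carries no data");
static_assert(std::is_same<Unwrapped<int>, int>::value,
              "scalars come out by value");
static_assert(std::is_same<Unwrapped<int&>, int&>::value,
              "references come out as references");
static_assert(
    std::is_same<Unwrapped<std::string>, std::shared_ptr<std::string>>::value,
    "class types come out as shared_ptr");
```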
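IsOk(): whether the Maybe holds normal data
error(): gets the error information
Data_YouAreNotAllowedToCallThisFuncOutsideThisFile(): gets the data of the normal flow. The name is deliberately unwieldy to keep users away and prevent them from calling it directly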
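To make the three methods concrete, here is a minimal sketch of a Maybe that handles only scalar T. The `Error` struct and the `SafeDiv` function are hypothetical, and the real class in maybe.h is considerably richer; this only shows how the three accessors relate to each other.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical error record; assumed here to hold just a message.
struct Error {
  std::string msg;
};

// Minimal sketch for scalar T only: holds either a value or an error.
template <typename T>
class Maybe {
 public:
  Maybe(T value) : value_(value) {}                                   // success
  Maybe(std::shared_ptr<Error> error) : error_(std::move(error)) {}   // failure

  // True when no error is stored, i.e. the Maybe holds normal data.
  bool IsOk() const { return error_ == nullptr; }
  // The stored error, or nullptr on success.
  std::shared_ptr<Error> error() const { return error_; }
  // The unwieldy name discourages direct calls; callers are expected to go
  // through a checking macro that tests IsOk() first.
  T Data_YouAreNotAllowedToCallThisFuncOutsideThisFile() const { return value_; }

 private:
  T value_{};
  std::shared_ptr<Error> error_;
};

// Hypothetical usage: both return statements convert implicitly to Maybe<int>.
Maybe<int> SafeDiv(int x, int y) {
  if (y == 0) { return std::make_shared<Error>(Error{"y cannot be zero"}); }
  return x / y;
}
```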
#define JUST(...) \
  ::oneflow::private_details::RemoveRValConst(({ \
    auto&& value_to_check_ = __JustStackCheckWrapper__(__VA_ARGS__); \
    if (!::oneflow::private_details::JustIsOk(value_to_check_)) { \
      return ::oneflow::private_details::JustErrorAddStackFrame( \
          ::oneflow::private_details::JustGetError(value_to_check_), __FILE__, __LINE__, \
          __FUNCTION__, OF_PP_STRINGIZE(__VA_ARGS__)); \
    } \
    std::forward<decltype(value_to_check_)>(value_to_check_); \
  })).Data_YouAreNotAllowedToCallThisFuncOutsideThisFile()
#define JUST(...) \
  ({ \
    auto&& value_to_check_ = __VA_ARGS__; \
    if (!value_to_check_.IsOk()) { \
      auto* stack_frame = value_to_check_.error()->add_stack_frame(); \
      stack_frame->set_file(__FILE__); \
      stack_frame->set_line(__LINE__); \
      stack_frame->set_function(__FUNCTION__); \
      stack_frame->set_error_msg(OF_PP_STRINGIZE(__VA_ARGS__)); \
      return value_to_check_.error(); \
    } \
    value_to_check_; \
  }).Data_YouAreNotAllowedToCallThisFuncOutsideThisFile()
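As you can see, the result of a function call wrapped in JUST (of type Maybe<T>) is first stored in value_to_check_ and then checked for errors with if (!value_to_check_.IsOk()). If an error occurred, the stack information for the failure is recorded and the error is returned immediately. Otherwise the data is taken out of the Maybe and becomes the value of the whole JUST expression.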
  float x2 = JUST(sqrt(x1));
  float x3 = JUST(div(1, x2));
  return x3;
}
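The early-return behaviour can be tried end to end with a stripped-down sketch. Everything here is a hypothetical simplification: `Error`, this scalar-only `Maybe`, the short accessor `Data()`, and the functions `Sqrt`, `Div`, and `InvSqrt` are made up for illustration, and the stack frame is reduced to a single string. Like the macro shown above, it assumes the GCC/Clang statement-expression extension.

```cpp
#include <cmath>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical error record: a message plus one entry per JUST it passed through.
struct Error {
  std::string msg;
  std::vector<std::string> stack;
};

// Hypothetical scalar-only Maybe with a shortened accessor name.
template <typename T>
class Maybe {
 public:
  Maybe(T value) : value_(value) {}
  Maybe(std::shared_ptr<Error> error) : error_(std::move(error)) {}
  bool IsOk() const { return error_ == nullptr; }
  std::shared_ptr<Error> error() const { return error_; }
  T Data() const { return value_; }

 private:
  T value_{};
  std::shared_ptr<Error> error_;
};

// Simplified JUST: on error, record where it was seen and return from the
// enclosing function; otherwise evaluate to the wrapped value.
#define JUST(...)                                            \
  ({                                                         \
    auto&& value_to_check_ = (__VA_ARGS__);                  \
    if (!value_to_check_.IsOk()) {                           \
      value_to_check_.error()->stack.push_back(              \
          std::string(__FUNCTION__) + ": " #__VA_ARGS__);    \
      return value_to_check_.error();                        \
    }                                                        \
    value_to_check_;                                         \
  }).Data()

Maybe<float> Sqrt(float x) {
  if (x < 0) { return std::make_shared<Error>(Error{"x must be non-negative", {}}); }
  return std::sqrt(x);
}

Maybe<float> Div(float x, float y) {
  if (y == 0) { return std::make_shared<Error>(Error{"y cannot be zero", {}}); }
  return x / y;
}

// Each JUST either yields a float or makes InvSqrt return the error early.
Maybe<float> InvSqrt(float x) {
  float root = JUST(Sqrt(x));
  float inv = JUST(Div(1, root));
  return inv;
}
```

With a negative input, the error produced in `Sqrt` is returned from `InvSqrt` at the first JUST, carrying one recorded frame, and the call to `Div` is never reached.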
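The error stack built by a chain of JUST calls
No function may take a Maybe as an input parameter;
Every call to a function that returns a Maybe must be wrapped in JUST (or in another macro provided by OneFlow's error-checking mechanism, such as CHECK_JUST).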
input = flow.randn(2,2)
index = flow.randn(2,2) # wrong data type
flow.gather(input, 0, index)
64 index.shape[i] <= input.shape[i]
65 ), "index.size(d) <= input.size(d) for all dimensions d != dim"
---> 66 return flow._C.dim_gather(input, index, dim=dim)
67
68
CheckFailedException:
File "oneflow/oneflow/core/framework/op_interpreter/op_interpreter_util.cpp", line 139, in Dispatch<oneflow::one::Tensor>
Dispatch<TensorTuple>(op_expr, inputs, ctx)
File "oneflow/oneflow/core/framework/op_interpreter/op_interpreter_util.cpp", line 131, in Dispatch<oneflow::one::TensorTuple>
Dispatch(op_expr, inputs, outputs.get(), ctx)
File "oneflow/oneflow/core/framework/op_interpreter/op_interpreter.cpp", line 94, in Apply
internal_->Apply(op_expr, inputs, outputs, ctx)
File "oneflow/oneflow/core/framework/op_interpreter/eager_mirrored_op_interpreter.cpp", line 139, in NaiveInterpret
user_op_expr.InferPhysicalShapeAndDType( attrs, device_tag ... TensorMeta* { return output_tensor_metas->at(i); })
File "oneflow/oneflow/core/framework/op_expr.cpp", line 436, in InferPhysicalShapeAndDType
dtype_infer_fn_(&infer_ctx)
File "oneflow/oneflow/user/ops/dim_gather_op.cpp", line 50, in operator()
Check failed: IsIndexDataType(index.data_type())
... related to the JUST conventions.
4
Summary
OneFlow has also built a static analysis tool on top of LLVM to ensure that developers use OneFlow Maybe<T> correctly, in line with these conventions.
References
Google C++ Style Guide, summary on exceptions: https://google.github.io/styleguide/cppguide.html#Exceptions
Error handling in LLVM: https://llvm.org/docs/ProgrammersManual.html#error-handling
Haskell Wikibook, introduction to Maybe: https://en.wikibooks.org/wiki/Haskell/Libraries/Maybe
Haskell Wiki, introduction to Monad: https://wiki.haskell.org/Monad
The ? operator in Rust: https://doc.rust-lang.org/reference/expressions/operator-expr.html#the-question-mark-operator
std::optional in C++: https://en.cppreference.com/w/cpp/utility/optional
C++ proposal for monadic operations on optional: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p0798r6.html